Interview with Marcel on Recognize AI
Ep. 06


Episode description

(00:00)

Welcome to our first long format interview! Consider this a bonus episode. Please share it with others if you enjoy it! Let me know what you think; your feedback is appreciated.

(00:20)

LinuxFest Northwest in Bellingham, WA April 25th - 27th

(00:37)

Quick Intro on Marcel - Developer behind Nextcloud Bookmarks, Floccus, Recognize

(01:04)

Recognize AI & ML for Nextcloud Photos documentation

(02:30)

Floccus - Browser Bookmark Syncing Extension for Chrome, Firefox, mobile clients, etc. Supports Nextcloud Bookmarks, Google Drive, Git, WebDAV and more.

(02:54)

Be sure to send in your feedback with this anonymous form!

(03:33)

Spread the word and share this show with others if you enjoy it! Thank you so much!

(03:45)

Interview with Marcel Begins

Beatles use AI to complete a new song

Nextcloud Assistant

Summary Bot for Nextcloud Talk Chat

What are Common AI Models & How to Use Them

Ollama, supporting DeepSeek and other kinds of models, from small to large.

Perplexica AI Search

Hope you enjoyed this first interview.

Download transcript (.srt)
0:00

Welcome back to Linux Prepper.

0:12

On the docket today we've got the first long format style interview with another person.

0:18

I'd like to start off by saying LinuxFest Northwest is coming up April 25th to the 27th. Also, if you like the show, please do share it broadly with anyone you think

0:29

might be interested. I'd really appreciate that.

0:32

It's always nice to grow the audience and grow the show. So share the show if you

0:36

can. That said, I will be interviewing Marcel today. Marcel is an employee of

0:43

Nextcloud. He is the developer of Floccus,

0:47

as well as the maintainer for some years now of NextCloud Bookmarks and works, I believe,

0:52

on integrations at NextCloud. And we're going to be talking about another project Marcel

0:57

developed, which is called Recognize. So what is Recognize?

1:05

So, going off of the admin documentation, Recognize provides media tagging and facial

1:10

recognition functionality for the Photos app.

1:14

Recognize can group similar faces on users' photos, known as face recognition.

1:18

It adds fitting tags to photos, detecting landscapes, food, vehicles, buildings, animals, landmarks,

1:24

monuments.

1:25

It can recognize music genres in audio files and add tags for those.

1:29

It can recognize human actions in video files and tag those as well.

1:32

It specifically runs open source models and does so entirely on premises.

1:37

And that is the basis of recognize.

1:40

It is a little bit of a demanding thing to run.

1:43

So basically it plugs into the photos app, right?

1:46

It runs on the back end.

1:48

Video card is supported, not required,

1:51

but obviously things will be slower without it.

1:52

This is x86 only.

1:55

And it looks like it requires four gigabytes of RAM minimum

2:00

just for itself.

2:02

The more CPU cores you have, the better

2:04

it says it is recommended to have 10 to 20.

2:08

And disk space for the models is about

2:11

one and a half gigabytes.

2:14

So this is an example of a machine learning

2:16

or, as we know it, an AI app.

2:18

That said, let's get to the interview.

2:20

I'll be talking with Marcel.

2:23

I'm just gonna be focusing on Recognize and AI.

2:26

We've also been discussing Floccus and Nextcloud Bookmarks,

2:30

but it just becomes too long, so that will be released at a different time.

2:36

So, anyway, I hope you enjoy this kind of unusual style of release for Linux Prepper,

2:43

doing more of a long format interview.

2:45

Be sure to tell me what you think. You can send in your feedback to podcast@james.network

2:51

or you can go on to discuss.james.network and post on the forum.

2:56

Otherwise, I'll include an anonymous form in the show notes.

3:00

You're welcome to fill that out anytime.

3:02

Give me your thoughts whether you like this format.

3:04

If you have suggestions. And coming up, we'll be returning to a more standardized release of

3:10

the show. Consider this more or less a bonus episode. Because of that, I'm not including any

3:16

sort of sponsorship related to this episode. I just wanted to start getting these long

3:21

format interviews out of the pipe, because I kind of have

3:25

a huge backlog and I want to start releasing them. So I do hope you enjoy this special

3:29

release of Linux Prepper and please do share this small podcast with other people if you

3:34

find it useful and interesting. Thank you so much. Enjoy.

3:37

I thought it would be fun to record a little segment with you just to get your thoughts

3:42

on things. You just kind of have your hands involved in a lot of things

3:45

and related to like kind of integrating tools together

3:49

and being involved with Recognize AI tool.

3:52

And I was curious if you could kind of give some more thoughts

3:55

related to that.

3:56

Because I think it's kind of a unique opportunity

3:58

to talk to you.

3:58

I guess the first question I--

4:00

Definitely.

4:01

Yeah, OK, cool.

4:02

I just have to preface this maybe with the fact that I'm not, in any way... yeah, I don't see myself as an expert on most of the things that I work with. So I dabble and I try my best, but I'm not, I don't know, I don't see myself as an AI expert, maybe, or stuff like that.

4:27

But I can... I have my opinions.

4:29

Yeah. So then I guess it's like, well, what do you see yourself as? Like, let's talk about

4:33

the Recognize photos app. Like, do you see that as AI work? Or do you not even see it as AI work?

4:38

Like, what... Where do you think? What do you think about it?

4:41

I think it's... It's probably mostly... There is this thing in the AI world that is called Machine Learning Engineering.

4:52

And that's probably what it most closely relates to.

4:55

So it's kind of like making AI models available to a user base.

5:01

You know, that's mostly what I do also at NextCloud. We don't develop our own

5:08

AI models because we just don't have the manpower, or woman power. We're just

5:17

especially in the AI department, or integrations team. People from the outside probably think we have like tons

5:27

of people working on these things. But in reality, the proportion is much worse. When you see in

5:38

our marketing how big the AI features are presented, compared to the people that work on them, it's much, much less;

5:48

much less work goes into them than how they are presented.

5:53

Does that mean, do you think... I mean, obviously there's a lot of granularity and different

5:58

approaches in regards to machine learning. That's always the way I heard it described.

6:02

Sorry, can you repeat? Yeah, I always heard this described as machine learning and never as AI.

6:09

Yeah, oh yeah, that's a shift, I guess. So traditionally, anything that involves kind of intelligent algorithms

6:25

is called AI, and machine learning is when you have

6:30

some sort of model that learns from existing data

6:37

to extrapolate to new data.

6:39

That's kind of it.

6:41

So most of what we see as AI today is specifically machine learning. It is of

6:47

course AI also. But yeah, even more specifically, people usually say AI when they mean large

6:54

language models. But even that is more narrow, a narrower field

7:02

in the machine learning realm.

7:05

And does that mean that when you use, like, the Recognize photos app, right, in Nextcloud,

7:12

is that related to a large language model or is it not?

7:16

Like what, you know, what's going on there?

7:19

So Recognize does not use any large language models.

7:29

I think we're using a photo classification model that's just kind of old by now, but

7:39

it's a model that takes as input an image and spits out one of a thousand categories, or I don't

7:48

know how many there are, but it's over a thousand, I think.

7:54

And these categories can be like dog, bird, tree, whatever.

8:01

That's what it does.

8:02

I see.

8:03

And that's a, that's an image recognition model, basically.

8:06

So it's part of, it's been trained with machine learning, but it's not a large language model

8:11

like ChatGPT, for example.

8:14

Right.

8:15

And so even though it's only working on a small amount of photos, it's still able to

8:19

make those classifications with relative accuracy.

8:22

Yeah, because it doesn't matter how many photos

8:26

you have because the model has been trained on lots of photos before it got to your machine.

8:33

So the model is not trained on your photos, but it just analyzes your photos. There's a

8:39

difference between training and inference. And the training is usually what is even more

8:46

compute intensive than just the inference

8:48

because in training you need a lot of data

8:53

to capture the variety of the space

8:56

you're trying to capture in your model

9:00

so that the model can well generalize well

9:03

to other new phenomena

9:05

Does that mean that training is not a part of Recognize, or is there training happening?

9:11

Is it optional?

9:13

There is no training happening in Recognize. Hmm. So Recognize,

9:17

it seems maybe that it might be training on your data when it does

9:23

Face recognition.

9:25

Yes.

9:26

But that's it's not traditional training.

9:31

I would say it's more of a clustering algorithm.


9:36

So it tries to detect faces in your photos and then extracts the face information as a

9:46

number sort of and then it clusters numbers together by virtue of whether

9:56

they are similar to each other. And the nice thing is that, since the numbers

10:02

represent the face information,

10:05

when numbers are close together in this space,

10:08

these are actually faces that are similar. That's how it works.
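
For listeners who want to see the clustering idea concretely, here is a rough, hypothetical sketch using scikit-learn; the embedding file, distance threshold, and library choice are illustrative assumptions, not Recognize's actual pipeline.

```python
# Illustrative sketch only, not Recognize's code: cluster face "numbers"
# (embedding vectors) so that vectors lying close together are treated
# as the same person, as described above.
import numpy as np
from sklearn.cluster import DBSCAN

# Assume a face-embedding model has already turned each detected face
# into a fixed-length vector and we saved them to disk.
embeddings = np.load("face_embeddings.npy")   # shape: (num_faces, dim)

# Group vectors by distance; eps is an assumed similarity threshold.
labels = DBSCAN(eps=0.5, min_samples=2, metric="euclidean").fit_predict(embeddings)

for face_id, person in enumerate(labels):
    # label -1 means the face did not fall into any cluster
    print(f"face {face_id} -> person cluster {person}")
```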

10:12

Okay. Interesting. And how did you choose the, um,

10:16

obviously you chose an upstream photo project. Um,

10:19

and I'll try to add it to the show notes. Do you want to talk a little bit about

10:22

that? Yeah, we use a Google model that was trained by Google.

10:29

It's called EfficientNet.

10:31

It's, at this point, quite old news.

10:37

And the framework we use is TensorFlow, which is also

10:41

a bit on the--

10:44

has some shelf life.

10:47

But yeah, we intend to revamp the Recognize app this year, hopefully.
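
As a concrete illustration of the kind of model being described (an image in, one of roughly a thousand tags out), here is a minimal sketch using a pre-trained EfficientNet from Keras; the file name and model size are assumptions, and this is a stand-in, not the code Recognize ships.

```python
# Minimal sketch: classify a photo with a pre-trained EfficientNet
# (ImageNet classes), analogous to the tagging step described above.
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications import efficientnet

model = efficientnet.EfficientNetB0(weights="imagenet")  # ~1000 categories

img = tf.keras.utils.load_img("photo.jpg", target_size=(224, 224))
x = efficientnet.preprocess_input(
    np.expand_dims(tf.keras.utils.img_to_array(img), axis=0)
)

preds = model.predict(x)
# Print the top three human-readable labels, e.g. "golden_retriever"
for _, label, score in efficientnet.decode_predictions(preds, top=3)[0]:
    print(label, round(float(score), 3))
```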

10:53

What about the models makes them old hat?

10:55

I mean, I've heard of TensorFlow and the EfficientNet model. What replaces them,

11:01

other models, just the training?

11:04

So newer models are generally better at recognizing stuff.

11:09

The corpus, or the training set, that this EfficientNet model is trained on

11:15

has been extended with more data.

11:18

And that is a newer training set that other models have been trained on.

11:24

But also models have generally been

11:26

become better at recognizing stuff in images every year, by a big margin. So, yeah,

11:38

accuracy is much higher in newer models. But also it would be nice to

11:44

have something that

11:45

probably Google Photos has where you type something. And it

11:49

doesn't need to mention the categories exactly that you

11:53

would find in Recognize. For example, in Recognize, if you

11:59

want to search your photos for photos of your dog, then you

12:03

have to match the exact tag name

12:06

that Recognize gives your photo, gives that photo of a dog, you know.

12:10

You have to match the tag name; you have to enter "dog" in order to get your dog photos.

12:17

But maybe you're searching for "terrier" or whatever the breed of your dog is, and with Recognize that won't match anything,

12:27

but Google Photos, for example, will find your dog anyway, because it

12:34

has a trick up its sleeve where they don't just

12:38

output the raw

12:41

categories themselves, but the raw numbers. Again, it's a bit involved.

12:49

But basically you can search in the vector space of the model with your

12:55

search and then you can match many more things than what you would be able to

13:02

match if you just translate your picture into

13:06

a string, into a text. I hope this makes sense.
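
To make the "search in the vector space" idea tangible, here is a small hypothetical sketch with an open CLIP-style model; it shows the general technique, not anything Recognize or Google Photos actually ships, and the file names are assumptions.

```python
# Hypothetical sketch of vector-space photo search: text and images are
# embedded into the same space, so "terrier" can match dog photos even
# though no tag is literally named "terrier".
from PIL import Image
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("clip-ViT-B-32")  # joint image/text embedding model

photo_paths = ["dog.jpg", "beach.jpg", "birthday.jpg"]    # assumed local files
photo_vecs = model.encode([Image.open(p) for p in photo_paths])

query_vec = model.encode("a terrier")                      # free-text query
scores = util.cos_sim(query_vec, photo_vecs)[0]            # cosine similarity

for path, score in sorted(zip(photo_paths, scores.tolist()), key=lambda t: -t[1]):
    print(path, round(score, 3))
```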

13:09

Yeah, no, it's interesting. And I appreciate you sharing. It makes me curious how you foresee

13:15

future developments, not in a, it doesn't have to be in a realistic way, you know, it could

13:20

be a pie in the sky. But for example, there's two different things

13:25

that are interesting, right?

13:26

Like you're talking about replacing the Google model

13:29

EfficientNet and TensorFlow.

13:32

So just that, would that mean in newer models

13:36

that not only could you have access to, say, larger models,

13:40

but also you could have access to more efficient, smaller

13:44

models that would be supported on hardware.

13:48

I guess I'm curious about that.

13:53

Yeah, so maybe I can delve a bit into the details.

14:00

We're trying to build a Python app.

14:04

Recognize traditionally has been a PHP app,

14:08

and there's not much possible within PHP natively

14:12

with machine learning.

14:13

So we're transitioning to a Python app

14:16

and hopefully that will give us the ability

14:20

to use newer frameworks and also better models, possibly even smaller models, a better

14:30

or a more nuanced range of models, so that you can choose which model you want to use. That would

14:36

be cool.

14:38

Yeah, the possibilities are boundless.

14:43

Yeah, you get all this development on the app.

14:45

Like, what would you see?

14:46

How would you see it working differently in the future?

14:48

Like you said, currently it's with tags.

14:50

How would you imagine it working differently?

14:52

Yeah, that would be cool if we could have this, this vector

14:56

space thing going with Recognize.

15:01

I'm not sure if it's feasible at this point because it would be

15:07

quite, quite some work. But yeah, the possibility to just enter a text that describes your image, the image

15:13

that you're thinking of, and the image that you're thinking of will pop up then.

15:20

That, I think, is really cool.

15:22

And yeah, that would be awesome.

15:24

Yeah. No, that would be awesome. Yeah.

15:25

No, that would be cool.

15:26

I guess since you're involved in such things,

15:28

it makes me curious, too.

15:30

I think AI, you know, looking more at AI

15:33

itself, machine learning, in regards to--

15:36

let's keep it with Nextcloud for the moment.

15:38

I mean, were there other areas in that project

15:41

that you found interesting developments in using machine

15:46

learning either locally or not that you think are worth mentioning?

15:51

Yeah, I think with Nextcloud, since I'm an employee at Nextcloud, we've been trying to

15:58

expand on the stuff we bring to the table in terms of AI. And I think, what's a good way of saying it,

16:06

we've increased the offerings for AI in Nextcloud, definitely,

16:13

I think, which pushed the boundary a bit.

16:16

There is now a large language model app

16:21

that you can use to chat with an AI bot, or a large language model, like ChatGPT does,

16:29

for example. We have a transcription app that allows you to transcribe stuff that you say

16:39

and also Talk calls, for example. You know, in Nextcloud Talk you can have

16:47

video calls and

16:49

Record them and

16:51

since a few

16:53

months or years, I don't know you can now also

16:58

transcribe the recordings

17:01

of these meetings. So you can have a written log of what you said

17:06

in which meeting.

17:07

And of course, because if you have the transcription app

17:11

and the large language model app,

17:13

the large language model app can summarize

17:17

the transcription, the transcript of your meeting.

17:20

So you have a meeting, you have a call recording, you can have a transcript,

17:26

and even a shortened summary of the transcript of your meeting. So I think that's pretty cool.

17:34

Yeah, something I was experimenting with, with Whisper, for this show, because it's mandatory

17:41

effectively for podcasting and for videos for subtitle type purposes.

17:46

Please continue to.

17:48

Yeah, Whisper is really cool in that regard. It's really pushed the boundaries for a lot of

17:54

applications to just be able to turn audio, spoken words, into text. It's brilliant. I love it.
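
As a side note for anyone curious, local transcription with the open-source Whisper package looks roughly like this; the file name and model size are just examples.

```python
# Rough sketch: turn spoken audio into text (and timestamped segments)
# with the open-source Whisper package, run entirely on your own machine.
import whisper

model = whisper.load_model("small")        # smaller sizes also run on CPU
result = model.transcribe("episode.mp3")   # language is auto-detected

print(result["text"])                      # the full transcript
for seg in result["segments"]:             # segments are handy for .srt files
    print(f"{seg['start']:.1f}-{seg['end']:.1f}: {seg['text']}")
```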

18:12

Yeah. And also we have image generation. So you can generate images with an app, also in Nextcloud, for all sorts of purposes.

18:20

Oh, speaking of that, I want to, I'm going to show you some pictures. Keep telling me, but I'm going

18:27

to throw you some pictures I don't think you've seen. But yeah, say more about the image generation,

18:32

please. Yeah, image generation. Image generation, of course, has a bit of a bad rep because it's

18:39

using all these photos off the internet and works of art of other people.

18:46

Um, yeah, but all of this is optional, of course.

18:50

So if you don't want to use it, just don't install it.

18:54

And it won't be installed by default.

18:55

I think hopefully ever. I think Nextcloud is sensitive enough to not push

19:03

AI on people if they don't want it.

19:05

That makes sense.

19:06

So take a look at this.

19:07

I tried to generate a Nextcloud mascot. See it?

19:14

Yeah, it's a cloud that grins at me.

19:19

Here's the cloud.

19:20

You know, the Nextcloud mascot.

19:21

Exactly.

19:22

This is the second iteration.

19:24

Oh, that's much nicer. That's creepy. -Yeah, first one's terrified.

19:30

-The second one is a drawn image. So a drawing or like a cartoonish

19:35

Nextcloud, with the glasses even. That's nice. -Yeah, those are the best ones. I'll leave it there.

19:41

What about all the others there?

19:45

They kind of become more offensive. So.

19:49

Stable Diffusion.

19:51

So yeah, that's what this is.

19:53

Is that what you guys use?

19:57

Yeah, we're also using Stable Diffusion XL.

20:02

That is the model I think we're using.

20:04

Yeah, then the technology behind it is also really mind-blowing.

20:10

It's basically-- it works differently than large language

20:16

models, and it's fascinating.

20:19

It starts with an image that is completely noise.

20:24

So it's just dither, gray dither.

20:29

And then it iteratively changes the image, pixel by pixel,

20:37

to resemble something

20:41

that you put in as text.

20:43

It's really cool.

20:45

Yeah.
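
For reference, generating an image with Stable Diffusion XL through the Hugging Face diffusers library looks roughly like this; the prompt and model ID are illustrative, and this is not necessarily how Nextcloud wires it up.

```python
# Illustrative sketch: text-to-image with Stable Diffusion XL via diffusers.
# Internally the pipeline starts from pure noise and denoises it step by
# step toward the prompt, as described above. Assumes a CUDA GPU with
# enough VRAM.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    "a friendly cartoon cloud mascot wearing glasses",
    num_inference_steps=30,
).images[0]
image.save("mascot.png")
```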

20:47

So how are you guys making use of this image generation?

20:52

Because I mean, I was kind of playing around with it,

20:55

but how would you use it in a more professional capacity?

21:00

It's a good question. I'm not sure if there is a good use for image generation, personally.

21:07

I think it can be useful for things like generating cover art for articles or newspapers, but I don't know. Yeah. I think maybe more interesting is to use these models to edit existing photos.

21:30

For example, if you have a photo where there's something in there that you don't want to be in the photo,

21:37

then you can try to use these models to edit the thing out more efficiently than you could with Photoshop or whatever you want.

21:50

Yeah, that is interesting.

21:52

Did you know about the Beatles song that they just put out this last year?

21:56

Yeah, it's a bit creepy.

21:58

Yeah, but that's what it was.

21:59

I don't know what I think of that.

22:01

I don't know what I think of the song, but that was the way that it was modified, right?

22:07

It's because the recording was unusable.

22:10

My understanding is it was not usable because it was a recording of the singer John playing

22:16

the piano and recording his voice because he was doing it as like a demo at home, right?

22:22

It wasn't meant to be released. So it was only using AI that they

22:26

were able to separate the voice and the piano and everything.

22:30

Okay. Yeah. That's cool. But it was kind of, you know, done with a similar method, to remove things.

22:39

I thought they had used AI more extensively to generate even the music.

22:43

They did. Yeah, I believe the guitar part was generated and based on recordings or whatever.

22:49

So I'm sure there was more to it than that, but that was the part that at least interested

22:53

me, you know.

22:54

But yeah, song-wise, not a song I find myself listening to ever again, but whatever, it's different.

23:02

Yeah, whatever. I don't know. It just like made

23:09

me realize, you know, you're working on these things, so it's fun to ask you about them.

23:13

Yeah, sure. Yeah. One more thing we have done recently, which is all the rage nowadays, is agent support, or agency, or whatever

23:30

it's called nowadays, which is the ability for a chat model to do things on your behalf

23:38

to actually act in the world, so to speak, in the virtual world, maybe.

23:48

Do you get what I mean?

23:49

- I think so.

23:50

I mean, the thing that I've been wanting,

23:52

at least from, like, a chatbot,

23:56

for me, is more of the hands-on help to people

24:00

because a lot of times people ask questions

24:03

that basically can be answered through,

24:05

you know, looking at the documentation

24:08

or things like that where it doesn't really need,

24:11

like I don't really need you to respond,

24:13

it's just kind of, I need to be directed to where to go.

24:17

I don't know if that requires a large language model to do,

24:19

but I think, you know, documentation

24:22

and that kind of guidance is always missing.

24:25

But are you thinking of something a different use?

24:27

Yeah, so the summary or regurgitation, so to speak, of documentation is a, I don't know,

24:37

maybe debatable use of AI, because on the one hand it's nice to not have to sift through tons of documentation or text yourself.

24:49

On the other hand.

24:52

If all we ask or use nowadays is ChatGPT for all our questions, then we make ourselves, we get dependent on this technology, and maybe other services, sites like Wikipedia, will

25:12

fade away.

25:13

And then, yeah, it becomes very easy to manipulate people also if you just trust these things

25:19

blindly.

25:20

So I'm a bit skeptical there. But on a small scale, if you just have an AI

25:26

that indexes all your notes and can answer your questions

25:31

about them, I think that can be nice.

25:33

If you still have the pile of notes in the back

25:39

and can access them, I think that's nice.

25:42

With agency, I was going more for this example

25:51

where you want something to be done for you. So for example, yeah, at Nextcloud, when we think of agency, we think about trying to make it easier for

26:09

the user to complete steps in the user interface, for example.

26:14

So repetitive steps, like sharing files with people or creating a Talk conversation and emailing somebody about it, you can technically now ask the chatbot

26:29

to do these things for you, you know? To do things on your behalf.

26:49

So it can add a new talk room to your talk conversations and then add a new participant

26:59

to that talk room. And if you want, it can email the participant about the talk room. And I don't know.

27:12

So, so functioning is like a series of automation of boring tasks, basically.

27:18

Exactly. So basically it's a kind of a,

27:24

yeah, a bit of like a servant, a Nextcloud assistant basically,

27:29

like a real assistant that does things for you in Nextcloud;

27:33

that has been our goal, to implement that.

27:37

- Does that mean it would integrate with something like

27:39

Flow, or would it be separate from, like, Flow,

27:42

you know, the actions application?

27:46

- For now it's separate.

27:48

Yeah.

27:49

So it has its own capabilities.

27:54

But they are rather simple to add because we're just using the APIs that we already

27:59

have for clients to connect to Nextcloud, or for apps to integrate with each other.

28:07

We already have all these APIs and now also the chatbot can just access them and do things for you.

28:17

What's important, though, is that it does not do things just of its own accord.

28:21

So for safety reasons we have, like, a, I don't know what to call it,

28:28

we have a screen that pops up when the chatbot wants to do something

28:34

potentially dangerous or

28:39

destructive, and then you have to approve. So that's something that I think

28:48

not a lot of implementations of these

28:54

chatbots do. When you say that, do you mean it would say, like, the list of actions it's going to do, like "I'm going to message these people and share this

28:58

document publicly," kind of thing?

29:01

Yeah, exactly.

29:03

That's cool.
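
To picture the pattern Marcel describes (the model proposes an action, a person approves it before anything runs), here is a generic, hypothetical sketch; the tool names are made up, and this is not Nextcloud Assistant's actual code.

```python
# Generic illustration of "propose, then approve" agent actions.
# Everything here is hypothetical; a real assistant would call real APIs.
def create_talk_room(name: str) -> str:
    return f"created room '{name}'"          # stand-in for a real API call

TOOLS = {"create_talk_room": create_talk_room}

def run_proposed_action(action: dict) -> str:
    # `action` would come from the chat model, e.g.
    # {"tool": "create_talk_room", "args": {"name": "Podcast planning"}}
    summary = f"{action['tool']}({action['args']})"
    answer = input(f"The assistant wants to run: {summary}. Approve? [y/N] ")
    if answer.strip().lower() != "y":
        return "action rejected by user"      # nothing runs without approval
    return TOOLS[action["tool"]](**action["args"])

print(run_proposed_action({"tool": "create_talk_room",
                           "args": {"name": "Podcast planning"}}))
```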

29:06

Do you feel like it's possible to run these services on-prem locally, you know, to do

29:12

these different kinds of things, like without having to call out to, you know, external

29:18

service provider like ChatGPT, or do you need to use those services?

29:22

Do you know what I mean?

29:25

Like what do you think?

29:28

- I think it's easier to use an API like

29:32

OpenAI's ChatGPT API, or whatever it's called;

29:39

it makes it very easy to realize these things,

29:43

because you just sign up with OpenAI, you

29:46

get an API key and off you go. You don't need to care about, is your GPU large enough? Or

29:53

do you have enough hardware to run these things? Yeah, I think OpenAI's API makes things much easier, but on the other hand, of course, you also

30:06

buy into all that.

30:10

Yeah.

30:12

The dependency and the tactics of OpenAI, which is not everybody's cup of tea, I guess.

30:21

Maybe it's nicer to do, to just run something locally.

30:28

And I mean, this is the self-hosted podcast, Linux Prepper.

30:33

Of course, it's also cool if you run things locally.

30:38

It's just more work.

30:44

So you need a GPU

30:45

with enough VRAM. That's

30:49

not very easy to get, or at least it's not cheap.

30:54

But then if you have a GPU, most of the stuff that we implement

30:59

should be working on-prem.

31:02

The only thing that we're struggling with right now is to make the agency

31:10

feature really work on free and open source language models

31:15

because, yeah, OpenAI is a bit ahead of open source models, of course,

31:22

with all their money that they have.

31:25

What's been your experience with that?

31:27

I was curious about it, you know, it's because you were talking about like

31:30

summaries earlier, you know, it's something that I've been experimenting

31:35

with.

31:36

So with my previous guest on the show, HB, we've been testing, yeah, we've been testing local language models. Right now,

31:47

it's been, like, Ollama, and Dolphin Mixtral is what it's called. But how do you find things?

31:55

Is that what you're referring to when you're talking about, you know, is something like

31:59

Ollama, like, either because they have smaller models, smaller in this case being something that requires, let's say, up to

32:06

24 gigabytes of RAM

32:09

is something like that like

32:11

functionally useful at all, like in the enterprise space, or are those kinds of models just not even that useful? I don't know. I

32:21

mean, I'm looking at it as a home enthusiast. You know, we're looking at it as enthusiasts,

32:25

I suppose, playing around with business companies.

32:28

Yeah.

32:29

There are models that are 24 gigabytes or less definitely, and they are usable, I think,

32:37

for most things. But yeah, I think on the one hand we have the phenomenon that for quite some time for

32:54

multiple months, and even, yeah, I don't know, one and a half years maybe, the principle has been that scaling up is the way to go.

33:06

So the more parameters a model had, the better it used to be.

33:12

Makes sense.

33:13

And now we're cut.

33:15

And so that birthed stuff like

33:18

Llama 3.3, I think, with 70 billion parameters. And then upward you go, with 400 billion

33:28

parameters and more. And these are obviously really good models, but who can run them on a GPU?

33:34

Almost nobody, except if you have tons of money.

33:50

But I think now, with DeepSeek recently and other models, people realize that maybe to stay competitive, you need to kind of go back to a lower size, to a smaller size,

34:00

to, yeah, to not waste resources also.

34:05

And I think that's cool because it also enables people

34:08

like us to just run these models ourselves.
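
For the home-lab angle discussed here, running a small local model through Ollama's HTTP API looks roughly like this; the model name and host are assumptions, and any model you have pulled locally would do.

```python
# Sketch: ask a locally running Ollama server (default port 11434) to
# generate text with a small model pulled onto your own machine.
import json
import urllib.request

payload = {
    "model": "llama3.2",                     # assumed: any locally pulled model
    "prompt": "Summarize why smaller local models are appealing.",
    "stream": False,
}
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```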

34:11

- Yeah, well, so the current state of our testing

34:15

right now, just internally,

34:18

is like we're running small models, they're running fine.

34:24

The jury's out in terms of their practical usefulness,

34:28

but it's like I'm still unsure, you know,

34:33

it's too far as far as what makes sense for going

34:38

to like the home enthusiast, you know,

34:40

what makes sense for a home enthusiast to run.

34:44

Even skipping over all the Nextcloud infrastructure, right?

34:47

Just to the model itself.

34:49

Like what's practical and we're still playing around with it.

34:52

But, you know, obviously it's just interesting because it's new

34:56

and it's interesting to explore it as like what it could be.

35:00

So it doesn't mean it'll become something.

35:03

But for me, I guess what I would

35:05

like is something related to summarization. And I don't know how to do this with an LLM,

35:12

but it would be excellent if there was an ability to have, for example, a summary of

35:20

our call right now, which somehow added the relevant links and contextual links to the discussion

35:28

into the show notes. Because show notes, as you can imagine, take me probably even longer

35:35

than editing, a lot of the time. Just writing out notes is really hard. It's really hard.

35:42

And it's not just finding, for example, your GitHub repositories, but also specific issues

35:49

or pull requests.

35:50

You have to look through the documentation and link to it or whatever.

35:54

It's complicated.

35:55

But that would be my personal intent.

35:57

I think the harder it is for you as a human, the more impossible it gets for an AI, I think. But yeah, it could be

36:09

done probably, I think, at some point. But I'm not sure if we're there yet to do links

36:17

for show notes automatically.

36:18

Okay. Yeah, that's just what I've been trying to figure out personally. I mean, you could try to ask an LLM agent to look for links to the relevant show note

36:39

items with a search engine and maybe use the first link that pops up.

36:46

But yeah, I'm not sure if that is a good idea,

36:50

because that's basically what a user can do also.

36:54

- Well, it's funny you should say that.

36:56

- If they hear you.

36:57

- There actually is a project for that.

37:00

Right now it's called Perplexity.

37:03

And I believe it's a,


37:06

I can't recall; there's Perplexity and Perplexica.

37:10

And what it is is one of them is a hosted paid service

37:13

or whatever.

37:14

One of them is an open source tool.

37:16

And basically it runs in tandem with the self-hosted

37:20

search engine, SearXNG.

37:23

And we're playing around with it right now.

37:25

Right.

37:26

But that's what it's doing. It's doing searches. We're still in the early days of figuring it out,

37:32

so it's an ongoing thing.
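
As a sketch of the search side of that setup, querying a self-hosted SearXNG instance for candidate show-note links could look like this; the URL is hypothetical, and the JSON output format has to be enabled in the instance's settings.

```python
# Rough sketch: ask a self-hosted SearXNG instance for a few candidate
# links (the kind of lookup Perplexica automates on top of its search).
import json
import urllib.parse
import urllib.request

SEARXNG_URL = "http://localhost:8080/search"   # assumed local instance

def search_links(query: str, limit: int = 3) -> list[str]:
    params = urllib.parse.urlencode({"q": query, "format": "json"})
    with urllib.request.urlopen(f"{SEARXNG_URL}?{params}") as resp:
        results = json.loads(resp.read())["results"]
    return [r["url"] for r in results[:limit]]

print(search_links("Nextcloud Recognize documentation"))
```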

37:33

But yeah, if you could hook up your summary stuff with something like Perplexity or Perplexica,

37:42

that could already do some heavy lifting for your show notes.

37:47

But yeah, I'm just not sure how good it's going to be.

37:50

I don't know. And then the other thing is it does like raise a question in my mind in regards to like

37:56

misuse of these kinds of tools, right? Because in the end, you're kind of just doing scraping,

38:01

right? You're just pulling through people's content, which is, that's, I think, a part of the problem with AI

38:09

as far as I understand it: AI is just trained

38:11

on everyone's data no matter whether they wanted it

38:14

to be or not, it just is.

38:16

And that is something that gives me pause, if that makes sense.

38:22

- Definitely, yeah.

38:23

- Yeah.

38:31

It's questionable. Yeah. Like you've seen, for example, Facebook using all these illegal books, torrented from God knows where, and they just train

38:37

their models on it. Yeah. Yeah. I'm not worried about it in terms of myself.

38:44

It's just, like, the ramifications more broadly, which are happening either way, you know,

38:49

raise my eyebrows basically.

38:51

I'm like, uh oh, it makes me wonder about the future and the present.

38:55

Yeah.

38:56

Also, for text, I'm sort of... I'm kind of decided that I think for text it should be okay to harvest text data and train a model on it,

39:18

As long as the data is openly available.

39:30

But for images, I'm more on the fence, to be honest, because it seems, I'm not sure if it objectively makes sense, but it seems more delicate. Also because a lot of these... So, yeah.

39:46

All right.

39:53

Well, that's all the questions I had for you. I just want to say I appreciate you

39:58

talking to me. So thank you. Yeah, thank you for having me.

40:03

Yeah. Thank you so much. We'll do it again. Definitely. I'm up for it. Yeah. I realize one thing we are missing, which I'll add to the beginning,

40:06

but I want to say that I'm happy to have you on, Marcel.

40:11

Maybe let's just give a quick recap of you as someone who is involved and feel free to fill in

40:18

the blanks on this: you work on Nextcloud Bookmarks and Floccus, I believe in your spare time.

40:25

And then you also work for the company.

40:28

And I think on integrations, related to your Recognize

40:32

photo machine learning app.

40:35

And what else do you do? You're a musician?

40:37

What else is going on?

40:38

- What else is going on?

40:39

I'm a hobbyist musician too.

40:41

Yeah, I try to write into the Internet sometimes with my thoughts.

40:48

But yeah, I geek around a bit in my spare time, but it's not much.

40:53

Not much more is happening.

40:55

And you're a photographer?

40:56

That's all.

40:59

I do photograph.

41:00

Yeah, I used to do that more often, but I've become more skeptical of photography as a hobby

41:10

because it seems more and more irrelevant if you can generate all sorts of pictures.

41:17

You think so?

41:18

Yeah, it's not only the...

41:22

It hasn't only been because of the image generation breakthroughs in the last,

41:27

in the last few years, but also before that, it was more and more ubiquitous that everybody

41:36

could photograph anything and it would look great.

41:39

So yeah, it's become a bit of an inflation thing with me, that I got kind of fed up with trying

41:47

to photograph things myself, because I'm a bit, "Yeah, why would the world need a photograph of

41:57

this thing now?" because it's probably been photographed 100 times already.

42:02

That's maybe a bit of a sad sentiment.

42:11

It's funny because I felt similar about a lot of things. I mean, you name it. Be it podcasting or art or anything, there's so much of something out there that it feels like

42:16

what's the point of adding more? Yeah.

42:23

And it's also accessible. You know, at your fingertips you have Google photo search,

42:29

and you can get a photo in just a second.

42:33

Why bother making one yourself?

42:36

Yeah, but at this.

42:37

Yeah, go ahead.

42:38

Do the counter argument.

42:40

Well, the truth is, I mean, this is just for me as a human being, you know, nothing's

42:46

more meaningful to me than the photo that the person took. You know, if somebody takes a photo,

42:53

my niece or something, or draws a picture and gives it to me, that is the most valuable thing.

42:59

And I still find that, you know, unless the person's heart is like a stone, people are just going

43:05

to love something like that, you know. Like, if you give them something that you, whether

43:10

however you did it, if you made something for them, you didn't just order it on Amazon.

43:14

Like you actually made something and you give it to them.

43:16

People remember that forever.

43:18

Yeah, that's a good point.

43:20

And I think there's, it reminds me of a recent thought I had because for photography,

43:28

I think an often-heard sentiment back when it was invented was also, well, you didn't make

43:34

that yourself, you just pressed the button. And it's kind of interesting to see that nowadays,

43:40

I don't know, maybe it's become more of a thing to accept.

43:45

Yeah, you made that, you did the photography.

43:48

So I'm wondering if at some point we'll be like this

43:52

for AI generated art.

43:54

Yeah, you prompted it, cool.

43:57

(laughing)

44:00

I'm not sure if that would be worthwhile

44:03

or worth living in that world, but it's just a thought.

44:07

All right. Well, thank you so much for coming on and doing a little interview with me. I really

44:13

appreciate it. And I'm going to close out here. Are there any last things you want to say? Just

44:18

live your life, folks, and have fun. Great. Thank you so much. We'll close it there.