CIO Exchange Podcast

The Current State of AI in the Enterprise

Episode Summary

What mindset shift should CIOs and CTOs make in order to succeed in an AI-driven world? How can they take control of their destinies, future-proof their organizations, and ensure responsible AI use? In this episode, Yadin sits down with Jeff Boudier, who leads Product and Growth at Hugging Face, for an in-depth discussion.

Episode Notes

What mindset shift should CIOs and CTOs make in order to succeed in an AI-driven world? How can they take control of their destinies, future-proof their organizations, and ensure responsible AI use? In this episode, Yadin sits down with Jeff Boudier, who leads Product and Growth at Hugging Face, for an in-depth discussion. They cover using a first-principles approach, teaching a machine to be friendly, and increases in productivity.

---------

Key Quotes:

“AI is becoming the default way to build technology and most technology is going to be running some machine learning models in the background. Those models are going to be running everywhere from cloud to data center all the way to your pocket.”

“At the end of the day, what I'm really looking forward to is for every single company in the world to be able to build and own their own models.”

“Open models, open datasets and open source AI are really the only way forward for enterprises. If they want to be future-proof,  in terms of auditability, in terms of regulation, in terms of compliance, then it's about being in control of your own destiny, right? Because AI is so key to everything you're going to be offering to customers.”

---------

Timestamps:

(01:46) The start of Hugging Face seven years ago 

(02:41) What was machine learning like at that time?

(03:49) Teaching a machine to be friendly 

(04:46) The shift that resulted from the seminal paper “Attention Is All You Need”

(06:20) The shift from writing an application to finding a model

(07:18) The increase in productivity from machine learning 

(08:21) Ethical use of AI 

(09:55) Are enterprises ready to think through complex AI questions?

(11:08) How CTOs are moving to a “model mindset”

(12:20) Evaluating models 

(14:40) Offering end-to-end solutions 

(16:13) Using a first principles approach

(20:49) Where will AI run? 

--------

Links:

Jeff Boudier on LinkedIn

CIO Exchange on Twitter

Yadin Porter de León on Twitter

[Subscribe to the Podcast] 
On Apple Podcast
For more podcasts, video and in-depth research go to https://www.vmware.com/cio

Episode Transcription

[music]

0:00:01.3 Jeff Boudier: It's not just trust, it's about ethical use of AI. And when you let a model make a decision on behalf of a human that is going to affect a human, then you might get into trouble.

0:00:15.4 Yadin Porter De León: Welcome to the CIO Exchange podcast, where we talk about what's working, what's not, and what's next. I'm Yadin Porter de León. Ensuring the responsible use of AI at scale at a large organization is a challenge that tech leaders are working to solve, even ahead of anticipated rules and regulations. What kind of mindset shifts do CIOs, CTOs, and other technology leaders need to make to successfully future-proof their AI efforts? Should they build their own proprietary models from open source, or consume closed models like GPT or Claude? In this episode, I speak with Jeff Boudier, who is in charge of product and growth at Hugging Face. Now, Hugging Face is a French-American company based in New York that develops tools for building applications using machine learning. It is most notable for its open source Transformers library built for natural language processing applications, and its platform that allows users to share machine learning models and data sets and showcase their work. Currently they have over 300,000 of these models and have become basically the GitHub for AI. During our conversation, we discussed the state of AI in the enterprise in depth, based on his experiences working with some of the largest companies in the world. We also spent a little time talking about the history and evolution of machine learning and the future of democratization. So, just for me personally, Jeff: Hugging Face — where did the name come from? Who came up with that? 

[laughter]

0:01:32.3 Jeff Boudier: It's not the first time I get that question.

0:01:34.9 Yadin Porter De León: I know, I know. I've heard other people answer it, but I don't think I've heard you answer it, so I would love to hear: what's your story for the Hugging Face name? 

0:01:45.5 Jeff Boudier: Yes. Well, our founders, Clément Delangue, Julien Chaumond, and Thomas Wolf, really wanted to tackle the hardest problem in AI at the time. And that problem was large language models — making them believable, and making AI friendly. And the first product of Hugging Face was actually a mobile application that you could use to converse with, like, your AI buddy.

0:02:13.0 Yadin Porter De León: The AI buddy. First of all, what was it? Why was it so tough? What was the problem that they were trying to solve in the beginning, too? 'Cause you said it was hard.

0:02:16.7 Jeff Boudier: It was hard because before the advent of the Transformers architecture and the initial BERT model, natural language processing didn't really work that well. And it was very hard, before the first GPT model came along, for models to really output interesting, cogent text. Yeah.

0:02:40.8 Yadin Porter De León: And so what was it outputting before? Let's say, give an example of what... This was a problem. Here's what the machine learning model was spitting out, and this is why it was totally useless. I mean, give me one example of that.

0:02:48.9 Jeff Boudier: Hugging Face was started seven years ago, and seven years ago there was no GPT, there was no BERT. Yeah. And the first GPT models could maybe predict the next word in the sentence...

0:03:01.1 Yadin Porter De León: Like one word? 

0:03:02.4 Jeff Boudier: One word. But if you wanted to extend that to something that made sense, it really turned into gibberish real quick. That was a huge scientific sort of challenge, and one that they were very excited about solving. And the fun thing was also to try to make AI friendly. And at the time, AI was very much a niche scientific subject. The scientific conferences around AI didn't draw the crowds that they draw today.

0:03:29.0 Yadin Porter De León: Yeah, it's a big shift too. And it's interesting you say the word friendly — that's a big thing from a scientific point of view, from a problem standpoint. Friendly is a really big deal. Why don't you give us a sense of how big a deal it is to make it friendly? It seems simple, but from a human perspective — we know how to be friendly, but how do you teach a machine how to be friendly? 

0:03:49.4 Jeff Boudier: At the time it required very, very complex systems. It wasn't like today, where we have one giant large language model that can just converse exactly the way you want it to and you instruct it to. Back then you needed like a dozen different models to try to understand what question you were asking, what was the tone, what was the sentiment, and what kind of category of question it was, so that you could try to come up with an answer that sort of made sense.

0:04:18.8 Yadin Porter De León: That sort of made sense. That doesn't seem to be like really solving a big enterprise or business problem or even a niche problem from an academic standpoint. It's like kind of made sense. I can imagine like a paper dropping and it's like great, the new language model that kind of makes sense. [laughter]

0:04:32.7 Jeff Boudier: And it turned out that attention was all you needed. And that's a reference to the seminal paper that came out from Google Brain, "Attention Is All You Need," that introduced the idea of the transformer architecture.

0:04:46.4 Yadin Porter De León: Yeah. It was a big shift.

0:04:47.9 Jeff Boudier: In the last four years — oh yeah, it was a tremendous shift. It wasn't very clear at the time. And I don't even think that the authors of the paper really saw the significance of it, because that innovation soon became applicable to many, many more contexts, many, many more domains than they initially thought. The paper was about a translation improvement. And now the transformer architecture is really the core of all the things that you use today that use machine learning.

0:05:21.6 Yadin Porter De León: Yeah. And they didn't know what they'd created at the time.

0:05:25.2 Jeff Boudier: I mean they had some sense that there was some broad applicability of what they discovered. But I mean today, everything that we use today that uses AI uses a transformer model. So we are writing an email and the mail app tries to complete our sentence. That's a transformer model.

0:05:42.5 Yadin Porter De León: Exactly. And then you just swipe right and it finishes a sentence for you. And so we're all gonna be like talking the same basically back and forth. Yeah. 'cause we just let AI write our emails for us.

0:05:52.1 Jeff Boudier: Even when we drive our car or the car drives for us. Now that's a transformer model too, right? The Tesla Autopilot.

0:05:55.2 Yadin Porter De León: Yeah. And so now that we've got a lot of this attention, we've got everyone believing — maybe not everyone, but a good chunk of, let's say, technology leaders — believing that there's tremendous value, and there's a lot of hype, and people have fear and doubt and all this cloud of uncertainty regarding the potential for these things. From an enterprise perspective, what's the biggest problem that you're seeing? 'Cause you talk to a lot of customers, your team talks to a lot of customers. What's the perspective that these big companies are having when they're looking at trying to change their mindset from "I'm gonna write an application to solve this problem" to "I'm gonna find a model to solve this problem"?

0:06:33.7 Jeff Boudier: That's the biggest shift. And we're really at the beginning stages of it.

0:06:40.7 Yadin Porter De León: Like how early, like is there a company right now that's looking at this, that's listening to this right now? Are they late? Are they at pretty much the same place as every other company? 

0:06:48.1 Jeff Boudier: They're not late. It's really, really just the beginning. Lots of large companies have had data science departments for a while that have been exploring, experimenting with all the new models that come out. But the new idea is really to use machine learning to build any new feature for your application, any new service that deals with data — that's a new thing. There are a lot of doubts and uncertainties, but one thing is really, really clear: the productivity gains when you are able to really use machine learning to assist what you're doing are super important. So talking about code generation, right? Software developer productivity increases by north of 25%, right? I think GitHub Copilot had a study about that.

0:07:34.1 Yadin Porter De León: 25%. And that almost seems kind of low right now, increasing productivity by 25%. I think a big part is that because humans still have to be involved in that. So it's not like 200%, 300%, it's still 25% because you still have real people that need to be involved in the process.

0:07:46.4 Jeff Boudier: 25% is the lower end of the margin for that study. But yes, that's a very good point because all these technologies are meant to assist us when we get into trouble is when we let the models make decisions for us.

0:08:01.2 Yadin Porter De León: I think in general, in life, when you give other people or other things the power to make decisions for you, you start getting into trouble. Now I think that the big part of that is trust, and we can talk about that — but I mean, finish your thought. I think trust is a big part of what you're talking about.

0:08:14.5 Jeff Boudier: It's not just trust. It's about ethical use of AI. And when you let a model make a decision on behalf of a human that is going to affect a human, then you might get into trouble. Because these models can be applied into so many things. These models can be applied to set your insurance policy price. These models can be used to decide who gets to enter the building or whatever it is.

0:08:42.5 Yadin Porter De León: Who gets an insurance policy? 

0:08:42.7 Jeff Boudier: Exactly.

0:08:46.7 Yadin Porter De León: Who gets into college, who gets this job or who gets at least a, a response from a recruiter? I can go super deep into that one. [laughter]

0:08:50.8 Jeff Boudier: That's right. And so that's why it's super important to make sure that the way that AI is being used follows some ethical guidelines. And one very important principle is to not let an AI make decisions for humans that's going to affect humans without any human in the loop.
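The human-in-the-loop principle Jeff describes can be sketched in a few lines. This is an editor's illustration, not Hugging Face code: the confidence threshold, the task labels, and the `route_decision` function are all hypothetical, chosen only to show the gating pattern.

```python
# Minimal human-in-the-loop sketch: the model only acts autonomously on
# low-stakes, high-confidence predictions; anything high-stakes or uncertain
# is escalated to a human reviewer. Threshold and task labels are illustrative.

CONFIDENCE_THRESHOLD = 0.90
HIGH_STAKES = {"insurance_pricing", "hiring", "building_access"}

def route_decision(task: str, prediction: str, confidence: float) -> dict:
    """Return the model's prediction plus the action the system should take."""
    needs_human = task in HIGH_STAKES or confidence < CONFIDENCE_THRESHOLD
    return {
        "task": task,
        "prediction": prediction,
        "action": "human_review" if needs_human else "auto_apply",
    }

# Routine, confident call can be automated; anything touching people cannot.
print(route_decision("spam_filtering", "spam", 0.97))  # action: auto_apply
print(route_decision("hiring", "reject", 0.99))        # action: human_review
```

The point of the sketch is that the escalation rule is policy, not model output: even a 99%-confident prediction on a high-stakes task still routes to a human.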

0:09:10.7 Yadin Porter De León: Yeah. And I think the big shift that we're talking about here in the way that enterprises are approaching these things is you'd never had an enterprise, not, I shouldn't say never, but most of your enterprise applications that are within your portfolio, they're not subject to a lot of these questions. If you deploy, let's say a grammar checking tool maybe, or a word processing tool or a presentation creation tool, you're not looking at the ethical implications of Microsoft Word. You're just not. It's a word processing thing. You type with it, it's got some functions, it's got some features and usually you see it that way. But when you have something like you said that can create, that can appear to ideate, that can then also make decisions on your behalf. Now you're entering a completely different territory and you have conversations with a lot of companies. Do you feel like they're built for that now? Or do you feel like there's a lot of work that needs to be done there? 

0:09:57.7 Jeff Boudier: There's a lot of work that needs to be done. And it's not exactly clear how, because the regulations around the use of AI are still being written. And so from a compliance and security perspective, you want to make sure that you are using AI in a way that minimizes the potential harms, that minimizes the potential legal risks. So take the example of a model that's going to be generating an email to respond to our customers. We know that large language models can hallucinate when they don't necessarily know how to respond.

0:10:35.6 Yadin Porter De León: They don't say.

0:10:36.5 Jeff Boudier: They don't say I don't know.

0:10:37.5 Yadin Porter De León: They don't say — they give you something. I think what Marc Andreessen said really, really well: he said they're like puppies. They wanna make you happy and they'll do what you ask them to do. Like a little puppy: I wanna make you happy? Here you go. And it'll just do it. You wanna go fetch that? Okay, I'll fetch it. Or, hey, write this thing about how Taylor Swift grew up in Guatemala and made clothes by hand? They'll write a story about that even though it never ever actually happened. And so you have to then factor that into the way that you're doing your design thinking when you're looking at how a model's gonna solve a problem. How are you seeing companies — especially CIOs, CTOs, technology leaders — look at that problem and that concern when they're deciding to move from an application mindset to a model mindset, when they're looking at delivering value and solving problems?

0:11:19.6 Jeff Boudier: I think one of the most important things is to be able to answer questions. And so today, if you're using a model as an API, meaning you are using an external service that offers a model, it's a closed source model, you don't know what it is, you don't know what it's trained on, and you don't know how big it is, and you're using that to power some features. If a question comes back to you about why did the model say this? 

0:11:47.4 Yadin Porter De León: Why did it do that? 

0:11:49.2 Jeff Boudier: If you get into trouble, then you are not really able to respond anything because you don't know what the training data was. You don't know anything really.

0:11:56.8 Yadin Porter De León: 'Cause a lot of these times these models may not necessarily, give me a little color here. 'Cause it's been said that sometimes when you're building this model, the people building the model don't know exactly why it's working. They just know if they throw enough data at it and they fine tune it and X, Y, and Z away, it does work and it works sufficiently well. Is that a good perspective or is that a bit of a misconception? 

0:12:17.0 Jeff Boudier: So the way that we evaluate models is by using benchmarks, where you're gonna throw at the model 1000s of different questions, see what comes back, and then sort of grade that.

0:12:30.9 Yadin Porter De León: Is that... Were you referring to red teaming? 

0:12:33.4 Jeff Boudier: So red teaming, no — red teaming is about poking at the model to try to find failure modes, right? Which is a very important step in hardening models. So whether a model works or not is never a binary question, right? So the way that we measure that is by evaluating them through those benchmarks: throwing at them 1000s of different questions, seeing what comes back, seeing what makes sense. You do this on several benchmarks to compare models between them. And we actually have on Hugging Face a really, really widely used tool called the Open LLM Leaderboard — LLM for large language model. It's free and it's available on the Hugging Face Hub, and it continuously evaluates now over 1000 different large language models on all these different benchmarks. So we can see right away which models are performing better on specific types of questions.
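The benchmark-style evaluation Jeff describes — throw many questions at a model, grade the answers, compare across benchmarks — reduces to a small loop. This pure-Python sketch uses toy questions and an exact-match grading rule for illustration; the real Open LLM Leaderboard harness is far more sophisticated.

```python
# Toy benchmark harness: grade a model's answers against references and
# report per-benchmark accuracy, so two models can be compared side by side.
# The data and the exact-match grading rule are illustrative placeholders.

def evaluate(model_answers, benchmarks):
    """Score one model: fraction of exact-match answers per benchmark."""
    scores = {}
    for name, questions in benchmarks.items():
        correct = sum(
            1 for question, reference in questions
            if model_answers.get(question, "").strip().lower() == reference.lower()
        )
        scores[name] = correct / len(questions)
    return scores

benchmarks = {
    "arithmetic": [("2+2?", "4"), ("3*3?", "9")],
    "geography": [("Capital of France?", "Paris")],
}
# Pretend transcript of one model's responses to every benchmark question.
model_a = {"2+2?": "4", "3*3?": "9", "Capital of France?": "Lyon"}

print(evaluate(model_a, benchmarks))
# {'arithmetic': 1.0, 'geography': 0.0}
```

Running several models through the same `evaluate` call is what turns "does it work?" into the per-benchmark comparison a leaderboard displays, rather than a binary yes/no.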

0:13:26.6 Yadin Porter De León: Yeah. And I mean the real big value of this — sort of the aha, like, why Hugging Face — is because you're able to iterate quickly, not within months, not within weeks, but within minutes, and be able to start moving forward with these types of things.

0:13:39.1 Jeff Boudier: We've been taking advantage of everything that open source AI offers. Ever since the release of the LLaMA model by Meta, there's been this explosion of open large language models.

0:13:51.8 Yadin Porter De León: I know, it's kind of this Cambrian explosion of all these different sort of projects and pseudo AI life forms. It's really interesting — once that comes out into the world, it's out. So people say, well, what happens when this gets out? It's already out. It's already out there. And so how do you do it responsibly, ethically, like you're talking about, and how are you helping some of the companies grapple with some of those questions? Okay, I believe you, I need to take a model approach instead of an app approach. What's the first step I need to start looking at when I'm starting to move my team, making sure my legal team understands, making sure leadership understands, and at the same time letting everyone know this is coming, but we're gonna do it in a responsible way? Do you help them go through that? 

0:14:33.1 Jeff Boudier: Yes. The way that we do that now is by offering solutions that are not just about the model — it's about the whole end-to-end solution. So a great example of that is SafeCoder, which we announced today at VMware Explore. What SafeCoder does is take the best open source model for code generation and provide the company, the enterprise, with a complete solution to train their own model using open source libraries, using open models, and then deploy it within their own IT organization. So with SafeCoder, you can take advantage of all the benefits of open source. You have access to the training data. Remember what we said earlier: the StarCoder model, which is the base for the SafeCoder solution, is trained upon The Stack, which is an open data set that was built from the ground up for compliance, right? So it's only trained on code that is licensed under a commercially permissive license. So you know you're not gonna get in there some stuff that shouldn't be in your code base...

0:15:44.3 Yadin Porter De León: Which is huge. It's absolutely huge because you need to be able to get basically like a nutrition label on your code and be able to say, here's all the ingredients, here's where all the different code came from. But if you can't answer where the code came from, that's a serious problem.

0:15:55.4 Jeff Boudier: That's right. And that's why, for me, open models, open data sets, and open source AI are really the only way forward for enterprises if they want to be future-proof — in terms of auditability, in terms of regulation, in terms of compliance. And then it's about being in control of your own destiny, right? Because AI is so key to everything you are going to be offering to customers — new features, new products, new services — there's going to be AI features if AI itself is not the feature. So you wanna be able to control that and build up your skill set.

0:16:34.1 Yadin Porter De León: Yeah. And so you touched on a couple different things there. One, how some organizations are a little bit intimidated because they think, okay, we're gonna have to do a really big proprietary high-density stack where we need special cooling, we need special electricity, we need all these different things to be able to really do this at scale. But maybe you don't need to create an LLM that does everything. Maybe you need to create something that just does one thing really, really well. What's your view on how companies should approach creating something really big that's general purpose versus creating something really, really special for just them — and maybe they shouldn't be so afraid of the cost? 

0:17:06.1 Jeff Boudier: Yeah, my view is that really companies need to think from first principles for every problem they want to apply AI to.

0:17:14.0 Yadin Porter De León: Yeah. And give them a sense of first principles — just a general primer for those who haven't used a first-principles approach before.

0:17:19.4 Jeff Boudier: Say you want to improve the search engine on your e-commerce website, and people come online and they want to find a red sofa, and you want that search to be smarter. So it's not just the listings that have "red" or "sofa" in the name that will show up — it's actually couches, magenta couches, that will also show up. So that's called semantic search. And to power that you can use large language models, so that you can build this representation of every listing in your catalog in a way that can be retrieved semantically. And of course you can use a gigantic large language model, something like GPT-3 or 4, to do that. But that would be really insane, because we have these very, very efficient small models that are extremely good at this and that can run on just a single CPU. Versus GPT-4 — nobody knows how big GPT-4 is, but it's probably a million dollars a year just to run one single instance.
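The semantic search idea in the red-sofa example — embed every listing as a vector and retrieve by similarity of meaning rather than keyword overlap — can be sketched in pure Python. The 3-dimensional embeddings below are hand-invented for illustration; a real system would get high-dimensional vectors from a small sentence-embedding model.

```python
import math

# Toy semantic search: each catalog listing is represented by an embedding
# vector, and a query is matched by cosine similarity instead of keyword
# overlap. The 3-d vectors are made-up stand-ins for real model embeddings.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

catalog = {
    "magenta couch": [0.9, 0.8, 0.1],  # close to "red sofa" in meaning
    "red wine rack": [0.8, 0.1, 0.7],  # shares the word "red", not the meaning
    "oak bookshelf": [0.1, 0.2, 0.9],
}

query = [1.0, 0.9, 0.1]  # pretend embedding of the query "red sofa"
ranked = sorted(catalog, key=lambda item: cosine(query, catalog[item]),
                reverse=True)
print(ranked[0])  # the magenta couch ranks first despite zero keyword overlap
```

That last line is the whole point: a keyword search would have surfaced the wine rack for "red", while similarity in embedding space surfaces the couch.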

0:18:25.0 Yadin Porter De León: I think that's a fantastic point. Because I think some people are equating that general-purpose gigantic large language model that runs on that big, really expensive stack with being able to provide the value they need for the business, when in fact that's not the case. Like you said — was it like a single GPU, graphics processing unit? Now people are talking about TPUs, tensor processing units, and we'll see where that goes. But you don't have to create a big gigantic stack. You can run stuff on existing infrastructure, but you need to make sure that you're really specific and not general when you're creating the model.

0:18:57.1 Jeff Boudier: That's what I mean by going from first principles, right? So, okay: what are the best possible model options for this use case? What do they need to run? And sometimes the answer is like one CPU or server, actually. So we have a service on Hugging Face for companies to deploy their own models, right? So they don't have to think about infrastructure — like, okay, I like that model, I want an API endpoint to hit it. And it's interesting to see that actually the majority of these endpoints that are created by companies are running on CPU. You wouldn't think that when you sort of follow along the latest AI news. And it's not just inference — inference means running the models, right, getting predictions out of the models — it's also true for training. So if you roll back two, three years ago, the common wisdom was that you needed 1000s of GPUs and 100s of 1000s, if not millions, of dollars of compute if you wanted to create your own large language models.

0:19:53.3 Jeff Boudier: And today, with new techniques like QLoRA, or parameter-efficient fine-tuning techniques — PEFT — you don't need to retrain the whole thing. You can actually be very selective in the part of the model that you're going to be retraining and get amazing results with only a few GPUs for a few hours. So that's really new, and I think that's gonna really democratize companies owning and building their own AI. And at the end of the day, what I'm really looking forward to is for every single company in the world to be able to build and own their own models.
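The low-rank idea behind LoRA and QLoRA that Jeff alludes to — freeze the big weight matrix W and train only a small pair of matrices A and B whose product is added on top — can be shown with plain nested lists. The matrix sizes and values are toy assumptions for illustration; real implementations live in libraries such as Hugging Face's `peft`.

```python
# LoRA-style parameter-efficiency sketch: the frozen d x d weight W stays
# untouched; only A (d x r) and B (r x d) are "trained", and the effective
# weight is W + A @ B. With rank r much smaller than d, this means training
# far fewer parameters than a full retrain would.

def matmul(X, Y):
    rows, inner, cols = len(X), len(Y), len(Y[0])
    return [[sum(X[i][k] * Y[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

d, r = 4, 1                                   # toy sizes: a rank-1 adapter
W = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]  # frozen
A = [[0.5], [0.0], [0.0], [0.0]]              # d x r, trainable
B = [[0.0, 0.0, 0.0, 2.0]]                    # r x d, trainable

W_eff = add(W, matmul(A, B))                  # effective adapted weight
full = d * d                                  # params a full retrain touches
lora = d * r * 2                              # params LoRA actually trains
print(f"trainable params: {lora} vs {full}")  # 8 vs 16
print(W_eff[0])                               # [1.0, 0.0, 0.0, 1.0]
```

At realistic scales the gap is dramatic: for d in the thousands and r around 8 or 16, the adapter is a tiny fraction of a percent of the full weight matrix, which is why a few GPUs for a few hours can be enough.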

0:20:29.4 Yadin Porter De León: Oh, that's fantastic. And one last thought: where do you think these are gonna run? Because people have the idea that this is gonna be so huge, it's gotta take up a ton of compute and storage, and you've gotta have these gigantic pieces of hardware to run it. If that's not the case, where do you think AI's gonna run? Is AI gonna run everywhere, where inference is happening all over the place? 

0:20:50.3 Jeff Boudier: It is already, to an extent, running everywhere, right? And so I think the challenge that is being actively worked on — there's a ton of progress right now on inference on the edge. But by and large, AI is what is driving today the increase of capacity at hyperscaler cloud providers, in data centers, at companies. And we see more and more ways to deploy those models directly on a local computer, even on a smartphone — there are actually great demos of the Llama 2 large language model by Meta running on a smartphone. There are great demos with Whisper. Whisper is transcription from audio, speech to text. So I think AI is becoming the default way to build technology, and most technology is going to be running some machine learning models in the background. And those models are going to be running everywhere, from cloud to data center all the way to your pocket.

[laughter]

0:21:54.7 Yadin Porter De León: Excellent. Excellent. Well, this has been a great conversation. I could talk to you for like another couple hours, but I think we're gonna pause right here. Tell listeners: where can they find out more about you and about Hugging Face, and follow you on the internet? Where can people learn more about what you're doing? 

0:22:07.4 Jeff Boudier: Of course. So the trick is, Hugging Face is at huggingface.co.

0:22:10.5 Yadin Porter De León: .co. Okay.

0:22:14.0 Jeff Boudier: Huggingface.co. You can actually type just hf.co and you'll get right there. That's where you'll find 300,000 open and accessible models, where you're gonna find 100s of 1000s of open data sets, and demos for you to play with and interact with everything that we talked about.

0:22:30.4 Yadin Porter De León: And what about you, Jeff? Where are you? 

0:22:35.8 Jeff Boudier: Oh, me. Oh, I'm just Jeff Boudier. I'm Jeff Boudier everywhere on Twitter, on GitHub.

0:22:36.2 Yadin Porter De León: Excellent.

0:22:37.1 Jeff Boudier: On LinkedIn, etcetera.

0:22:38.3 Yadin Porter De León: All right. We'll include all that stuff in the show notes. Well, Jeff, I really appreciate the conversation, and thanks for joining the CIO Exchange podcast.

0:22:42.8 Jeff Boudier: Thank you so much.

[music]

0:22:47.8 Yadin Porter De León: Thank you for listening to this latest episode. Please consider subscribing to the show on Apple Podcasts, Spotify, or wherever you get your podcasts. And for more insights from technology leaders as well as global research on key topics, visit VMware.com/cio.

[music]