Round Table Discussion Highlights 2: First Mile Problem - Data Harvesting

Updated: Aug 17


This article features discussion highlights from the 1st Fellows.fund Round Table Discussion, which took place on July 17th, 2021. Speakers include: Vijay K Narayanan, Daniel Kokotov, Lei Yang, Haixun Wang, Xuedong Huang, Anshul Pande, Gang Hua, Alex Ren


View Highlights 1: Enterprise AI - Startup's "Last Mile Challenge"


Microsoft Azure AI CTO Xuedong Huang at Round Table Discussion

Full video of Fellows Fund Summit 2021: Tech Entrepreneurship in the AI Era

Fireside chat between Eric Yuan and Xuedong Huang


First Mile Problem - Data Harvesting


Alex Ren

How about CV applications? Where are the potential opportunities?

Gang Hua

Chief Scientist at Blibee Gang Hua

Digitizing the physical world using CV technology can make a killer app. It is going to bring transformative innovation to a lot of traditional business domains. Think about IoT+CV. The camera is indeed a unique IoT sensor. It is very general and the information it captures is dense. But you need to pair it with powerful AI in order to digest the information from the physical world and transform it into a digital form. Look at other IoT sensors: they are just not that advanced, because they can usually fulfill only one single task, whereas camera sensors with CV can digest a broader set of information.


Hence, from my point of view, the general capability of digitizing the physical world and linking the online with the offline is where computer vision is going to play a critical role.


Daniel Kokotov

That sounds like a huge opportunity, but the data problem, and in particular the heterogeneity of data, hasn't been solved yet. Bringing things all together, or getting insights that span these data sources, is hard.

Xuedong Huang

I completely agree with Dan. If you can find a way to harvest data, then you can actually sell that data to all the startups who need it.

Vijay K Narayanan

That's almost like the first mile problem: accessing the data across data silos. This is an excellent point. Unfortunately, we've gotten into a situation where data in most enterprises is locked in a few proprietary ecosystems. I think there's a tremendous opportunity to bring about a layer of abstraction on top of that, where you're able to do AI on data across all of these different data silos. In fact, we've been looking at this very deeply since last year. My team and about 80-90% of our customers are solving the "first mile" problem. Once they solve that and start to put things together, it's not a one-time thing, because these data pipelines are continuous. You will need to keep maintaining them and keep training them.


Haixun Wang

VP Engineering at Instacart Haixun Wang

There are two sides of the story to think about. On the one hand, we do have lots of data, like any e-commerce company. When you have a lot of user engagement data, you run into problems such as how to use your data effectively. That is the problem many startups are facing now. People are talking about democratizing AI. There's Azure AI, Google AI, cloud AI, and Amazon AI, but it still takes a lot of effort for startups and enterprises to put all these things together. Why did I join Instacart? Or why did I join WeWork? The same thing. I thought I was going to do machine learning and build models, but I ended up doing a lot of machine learning infrastructure work. It is what we call MLOps right now.


The other thing is, we want to create a very nice customer experience for users. But if you look at current e-commerce companies, whether early-stage startups like Instacart or even Amazon, the experience really sucks. If you are in a physical store, you get much more information than you can get from spending 10-20 minutes on a web interface. Computer vision can let us actually see the products, instead of just one picture or one icon.

Gang Hua

Yes, you brought up a very good point about visualizing the information. Several startups actually tried to build a virtual reality experience for customers during the pandemic. For example, they have 360-degree cameras hovering around the store, so people can look into it remotely using VR glasses. I think that's still very different from seeing the product offline.

Vijay K Narayanan

Yes. So how do you use that data more efficiently? Can you take some of these pre-trained services and tune them very quickly? Or can you generate synthetic data that mimics the characteristics of the real data? There's a broad class of possible problems for startups to solve.

For startup pitches, I am already at a point where, if any company starts with "I use transformers" or "I use whatever the latest contribution is", I'll say "excellent, forget it". That is the flavor of today, and in six months you will see something else. Then what is your thing? So really, it's about what problem you're solving, rather than a technology quest.


About five or six years ago, while we were working on a problem, a lot of startups came to us. Nine out of 10 said, "We're going to build this machine learning tool for you." And 9 out of those 10 companies don't exist today, because you don't need 100 tools; you only need a few really good ones.



#ai #MicrosoftAzureAI #enterpriseai #startupopportunities #machinelearning #aiapplication

