How AI & Machine Learning Are Helping Real Estate Businesses Clean Up & Organize Their Data



When someone thinks of AI and machine learning, they think of a scientist in a coat in front of a whiteboard or chalkboard, and those algorithms are scary. I’ve seen you dissect a few, and they can be quite lengthy and intimidating. What I realized after running my last startup was that no matter how good your data science team is, or how good those PhD modeling people are, they can't do anything unless they can access the data.

The other surprising thing to me, and doing your course and other courses has only confirmed this, is that it takes just a few lines of code to implement an existing algorithm. Some of these algorithms are really powerful! You don't need to go reinvent the wheel; in fact, that's probably the worst thing you can do.

You literally might just want someone who can access the data, clean the data, and type a few lines of Python code and boom! You've done work that humans previously couldn't do; even teams of humans couldn't get to the output a machine can get to.




I think in that case, what we are trying to achieve with the course seems to have been a success. We are in this era of open-source technology so yes, you are absolutely right. You probably shouldn't be writing new algorithms; you should be implementing what other people have already written for two main reasons:

  1. because it's more time efficient, and
  2. because there are some smart guys, or teams of smart guys, out there making sure it's correct and keeping it up to date.

Get the data into a format that your team can build applications on. Once it's all there and cleaned up, you can go. Getting the data together can be eighty or ninety percent of the time, and running and optimizing the model is probably the other ten or twenty percent. That remains true.


And I'd say there's a third thing too: you must remember that there are teams of experts out there. Some are scientists and PhDs, and others are working at Google or Facebook, and they're producing powerful algorithms with lots of resources that are being open sourced. The community is very powerful. I can't really think of any instance where it makes sense to go and hire a PhD-level researcher to write algorithms and models. To me, there aren't many use cases for that in real estate, or if there are, they are at the very edge. What are your thoughts?




One of the reasons we launched the course was that we are in this open-source era. You can download all these algorithms, but for the sort of specialized, domain-specific code for running some things in real estate, we don't see any libraries or packages that you can directly import.

I don't know that you need a PhD-level researcher; just someone who understands what they're doing. They may have their master's, or maybe they've just really geeked out about this for a long time and studied it. That's fine too, but you do need somebody who can convert some of this into a library of code that you can use. That's part of what we do in the course as well. We do share some other things that we've coded up. If you are trying to do something specific in the real estate space, there are a couple of cases in which there is something new.

There's a little bit of innovation … I’ll give you another example. There's a little bit of information about how we use, say, image recognition for real estate. We have companies that are looking into grading the quality of a property based on the photos from a listing. We can think about companies like FoxyAI, Rhasspy AI and so on. So yes, there is a bunch of open-source stuff about image recognition. We can identify, say, a cat from thousands of photos, but it's a little bit different to answer this question in real estate: how do we determine the condition? Maybe there are very specific examples like that.

Another example (again with image recognition) is floor plan analysis. How do we automatically recognize, from a JPEG or PDF, the various rooms inside the floor plan, or their size? This still seems to be an unsolved problem. You may need something new if you're trying to solve really specific cases like that, but I think that, for the most part, what people are trying to do doesn't require it. It requires somebody to take existing algorithms and apply them to their problem.


That focus on data, it's not a glamorous role either. It can be quite a boring role sometimes for many engineers. That's probably why it's been difficult to hire for. In my last startup, hiring good data engineers was the most important thing we could do.

Also, companies tend to ‘kick the can down the road’ with this. When their architecture isn't in the right place, when data integrity isn't being respected or the data is being stored in different places, it's a lot more work to untangle everything and redo the architecture.

Sometimes that's what's needed before you can put any of that ML to use (garbage in, garbage out). If the data isn't clean, or if the data isn't even being collected and stored properly, that's where you've got to focus, and that's not an easy job.
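To make "garbage in, garbage out" a bit more concrete, here is a minimal sketch of the kind of audit that could run before any model sees the data. It is only an illustration under stated assumptions: the two sources (`crm`, `listings`), the field names, and the helpers `normalize_address` and `audit_records` are all hypothetical, and it assumes `address` is always among the required fields.

```python
def normalize_address(address):
    """Lowercase and collapse whitespace so one address matches across sources."""
    return " ".join(address.lower().split())


def audit_records(records, required_fields):
    """Split records into clean rows and (record, problem) pairs.

    Assumes "address" is listed in required_fields.
    """
    seen = set()
    clean, problems = [], []
    for record in records:
        # Flag rows where a required field is absent or empty.
        missing = [f for f in required_fields if record.get(f) in (None, "")]
        if missing:
            problems.append((record, "missing: " + ", ".join(missing)))
            continue
        # Flag rows whose address was already seen in another source.
        key = normalize_address(record["address"])
        if key in seen:
            problems.append((record, "duplicate address"))
            continue
        seen.add(key)
        clean.append(record)
    return clean, problems


# Hypothetical records for the same portfolio, stored in two different places.
crm = [
    {"address": "12 Main St", "price": 250000},
    {"address": "9 Oak Ave", "price": None},
]
listings = [
    {"address": "12  main st", "price": 250000},
    {"address": "4 Elm Rd", "price": 310000},
]

clean, problems = audit_records(crm + listings, required_fields=["address", "price"])
```

Here `problems` collects the rows someone has to fix at the source, which is the unglamorous part of the job described above.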


This is the PropTech VC podcast. We give you unique insights into how innovative technologies are disrupting real estate. We interview top entrepreneurs, investors and knowledgeable experts to share the inside scoop in this fast-moving industry. It's hosted by PropTech VC Zain Jaffer.

