Homesareexpensive.com – Accomplishments, Disappointments, And Lessons Learned

screenshot
The county view I’ve been working on, but Smack wanted less harsh colors

Smack posted this on r/dataisbeautiful about two months ago; it had the same URL but he wanted an alternative to “box view.” Box view looks quite a bit different.

screenshot
One of the mods complained that there had to be a screenshot, so he included that

The units here are per $1000. There was no relative coloring, but the yellow/red cutoffs were always configurable. His defaults were $450,000 and $550,000, if I remember correctly (hence the large amount of green).
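For readers who want to see the cutoff idea spelled out, here is a minimal sketch in Python. It is not the site's actual code: the function name `price_color`, the choice to treat a price exactly at a cutoff as the next color up, and the use of Python at all are assumptions for illustration. The default cutoffs are the ones recalled above.

```python
def price_color(price, yellow_cutoff=450_000, red_cutoff=550_000):
    """Return 'green', 'yellow', or 'red' for a home price in dollars.

    Hypothetical helper: prices below yellow_cutoff are green, prices from
    yellow_cutoff up to (but not including) red_cutoff are yellow, and
    everything at or above red_cutoff is red.
    """
    if yellow_cutoff > red_cutoff:
        raise ValueError("yellow_cutoff must not exceed red_cutoff")
    if price >= red_cutoff:
        return "red"
    if price >= yellow_cutoff:
        return "yellow"
    return "green"


# The cutoffs are plain arguments, so a user-chosen pair can be passed in.
for price in (300_000, 500_000, 700_000):
    print(price, price_color(price))
print(500_000, price_color(500_000, yellow_cutoff=600_000, red_cutoff=800_000))
```

With fixed cutoffs like these, any area where most homes fall under the yellow cutoff comes out almost entirely green, which matches the screenshot.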

I had hoped for a big announcement post for version 2.0, but it’s not ready yet. If you go to the website right now, county coloring isn’t added yet and the load time for all shapes (dots/boxes) is really slow. I still consider it usable – the user has to know to wait a minute whenever moving anywhere – but we don’t agree on that.

Overview

Thanks to what Smack called “The Big Scrape,” there is now data available for the entire United States. It…takes a minute, thanks to what I added and the additional data.

screenshot
These boxes aren’t the prettiest, but they automatically adjust depending on how far you zoom and where you zoom to

As Smack already revealed in the Reddit post, this website uses React/OpenLayers for the frontend, and Flask for the backend (but he is transitioning away from Python). It used a 1-core Hostinger VPS that held up surprisingly well against a night of heavy traffic; I have upheld the 1-core tradition. I gave Smack access and temporarily own the host; Smack, in turn, owns the “homesareexpensive.com” domain.

I added relative coloring; Smack did everything prior. I worked on county coloring, which isn’t done.

Lessons Learned

Use SSH keys

Maybe the single most interesting thing I learned in all of this is that websites like this one are constantly brute-force probed (is “attacked” the right word?), maybe not even by humans. Fortunately I did not have to learn this the hard way because Smack immediately told me to disable password login and give him a public key. To prove a point, he recommended I monitor the log file just to see all the attempts to log into the server.
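A rough way to put a number on those attempts is to count the log lines per source IP. The sketch below is an assumption-heavy illustration, not what was run on this server: the regular expression targets two message shapes that OpenSSH's sshd commonly writes ("Failed password for ..." and "Invalid user ..."), the exact wording depends on the sshd version and configuration, and the sample lines are made up using documentation-only IP ranges.

```python
import re
from collections import Counter

# Assumed message shapes; adjust the pattern to whatever your own log shows.
LOGIN_ATTEMPT = re.compile(
    r"(?:Failed password for (?:invalid user )?\S+|Invalid user \S+)"
    r" from (?P<ip>\S+) port \d+"
)


def count_login_attempts(lines):
    """Count log lines that record a failed or invalid login, per source IP."""
    attempts = Counter()
    for line in lines:
        match = LOGIN_ATTEMPT.search(line)
        if match:
            attempts[match.group("ip")] += 1
    return attempts


# Made-up sample lines, not taken from the real server.
sample = [
    "Jan  1 00:00:01 host sshd[101]: Invalid user admin from 203.0.113.5 port 40022",
    "Jan  1 00:00:03 host sshd[102]: Failed password for root from 203.0.113.5 port 40023 ssh2",
    "Jan  1 00:00:05 host sshd[103]: Invalid user test from 198.51.100.9 port 41000",
    "Jan  1 00:00:09 host sshd[104]: Accepted publickey for smack from 192.0.2.7 port 50000 ssh2",
]

for ip, count in count_login_attempts(sample).most_common():
    print(ip, count)
```

To run it against a real log, pass an open file instead of `sample`; the log's location varies by distribution, so no path is hard-coded here.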

Avoid artificial intelligence if it is paired with human stupidity

This could be its own blog post (or three), but Smack said AI-assisted coding was encouraged (I don’t remember if he ever used the term “vibe code,” which I now realize is a different thing). That being said, it would probably fail with OpenLayers.

Actually, what it gave me worked but hurt the performance in a way I had not anticipated. The backend was a different story and will probably take some time to unpack – initially it was giving this bad geojson file I had to manually update. It also kept trying to convert polygons into linestrings for some reason, but not in a way that was obvious or easy to correct.
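One cheap guard against the polygon-to-linestring problem is to check geometry types before a GeoJSON file goes anywhere near the map. This is a minimal sketch, assuming the county shapes are stored as a GeoJSON FeatureCollection of `Polygon` or `MultiPolygon` features; the function name and the tiny sample data (including the `name` property) are hypothetical.

```python
POLYGON_TYPES = {"Polygon", "MultiPolygon"}


def unexpected_geometries(feature_collection):
    """Return (index, geometry type) for every feature that is not a polygon."""
    problems = []
    for index, feature in enumerate(feature_collection.get("features", [])):
        geometry = feature.get("geometry") or {}  # geometry may be null in GeoJSON
        geometry_type = geometry.get("type")
        if geometry_type not in POLYGON_TYPES:
            problems.append((index, geometry_type))
    return problems


# Made-up data: the second feature has been turned into a LineString.
sample = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"name": "County A"},
            "geometry": {
                "type": "Polygon",
                "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 0]]],
            },
        },
        {
            "type": "Feature",
            "properties": {"name": "County B"},
            "geometry": {
                "type": "LineString",
                "coordinates": [[0, 0], [1, 0], [1, 1], [0, 0]],
            },
        },
    ],
}

print(unexpected_geometries(sample))
```

A check like this only catches the wrong geometry type; it says nothing about whether the coordinates themselves are right.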

…And if I remember correctly, it’s not like AI just automatically worked, and from there it was just tweaking. AI would fail, fail again, then work a little bit by messing up something else. Weeks later, I would find out about an issue I had not realized AI caused (additional API calls). That kind of leads to the next point.

Stress Test

I was so happy when I figured out how to set up the site on a local environment that I failed to realize the dangers of a “simplified” workflow. Instead of copying ALL the data, I just copied a really small subset of the data onto my local environment.

As a result, my features seemed to run very smoothly. I should have tested with a data set that was the same size as or larger than what the site would really be working with.
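A sketch of what that could look like in Python: generate synthetic rows at increasing sizes and time the same query at each size. Everything here is an assumption for illustration, including the row layout (longitude, latitude, price), the rough bounding box used for random points, the row counts, and the simple list scan standing in for whatever the real backend does.

```python
import random
import time


def make_points(count, seed=0):
    """Generate `count` synthetic (lon, lat, price) rows inside a rough box."""
    rng = random.Random(seed)
    return [
        (rng.uniform(-125.0, -66.0), rng.uniform(24.0, 50.0), rng.uniform(50_000, 2_000_000))
        for _ in range(count)
    ]


def points_in_view(points, west, south, east, north):
    """Return the rows whose coordinates fall inside the given map view."""
    return [p for p in points if west <= p[0] <= east and south <= p[1] <= north]


def time_query(count):
    """Build `count` rows, run one view query, and return (matches, seconds)."""
    points = make_points(count)
    start = time.perf_counter()
    visible = points_in_view(points, -123.0, 37.0, -121.0, 39.0)
    return len(visible), time.perf_counter() - start


# Placeholder sizes; swap in the real row count (or more) for a fair test.
for count in (1_000, 10_000, 100_000):
    visible, seconds = time_query(count)
    print(f"{count:>7} rows: {visible} in view, {seconds:.4f}s")
```

The point is less the exact timings than seeing how they grow: a feature that feels instant on the smallest size can look very different on the largest.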

Disappointments

I thought it would be done by now

Even blogging about it now is jumping the gun, but no one seems to be following this series. If anyone is, just…wait a minute for the data to load, I guess.

Ownership

I think we both had hoped for a 50%/50% split, with me developing my own features once the project was handed off. As it stands, it’s still definitely not 50% mine and I still needed to ask a lot of questions. A lot of this has to do with performance. The initial version was fast enough to not get dumped on by Reddit, but this?

Accomplishments

I can now say I was part of a project with lots of actual users, albeit coming in after a lot of the users had moved on.

All things considered, it was a nice exercise for me to work on a map project that I could actually talk about openly (especially considering I have blogged very little about maps in the past). I was using some familiar file types. I was messing around with smaller data sets in CodePen. For the first time in what seems like forever, or maybe the first time ever, I was touching something at least tangential to data science.

Lastly, whether Smack likes it or not, this is probably going to be a reasonable source of blog content. I am already realizing how much I miss Medium (which are words I never thought I would write).

Maybe in the next blog post I will figure out how to include actual code samples.
