Acknowledgment

Huge thanks to the contributors who did the heavy lifting on this project:

Jet Chiang, who figured out the network visuals;

Eric Xie, who worked on the database; and

Paul Dong, who worked on the frontend.

MapMatch - A semantic-based matching algorithm w/ AWS

Read about how we crammed together a semi-working AWS-based roommate matching algorithm in 6 hours.

The AWS Hackathon

Oh my God, how are we supposed to make something work in 6 hours? Don't get me started on the fact that we had to use AWS. I mean, I love AWS, but it's a pain to set up.

The Algorithm

We used a simple algorithm that takes in a description of a person and matches them with another person based on how similar their descriptions are. Everything runs on AWS: an AWS Lambda function obtains the semantic embedding of each description, and we compare the embeddings to find the most similar descriptions. We save the user data in Amazon DynamoDB and host the server on, you guessed it, AWS.
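For the curious, here is a rough sketch of what the embedding step inside a Lambda function could look like. It is not our actual hackathon code. It assumes Titan is called through the Amazon Bedrock runtime with boto3's `invoke_model`, and the model ID, the event shape (`{"description": ...}`), and the names `embed_description` and `make_handler` are all made up for illustration.

```python
import json

# Assumed model ID; the post only says "Titan", not which version we used.
TITAN_MODEL_ID = "amazon.titan-embed-text-v1"


def embed_description(client, text):
    """Return the embedding (a list of floats) for one description.

    `client` is assumed to be a boto3 "bedrock-runtime" client, e.g.
    boto3.client("bedrock-runtime"), created once outside the handler.
    """
    response = client.invoke_model(
        modelId=TITAN_MODEL_ID,
        body=json.dumps({"inputText": text}),
        contentType="application/json",
        accept="application/json",
    )
    # The response body is a stream holding JSON with an "embedding" key.
    payload = json.loads(response["body"].read())
    return payload["embedding"]


def make_handler(client):
    """Build a Lambda-style handler bound to a Bedrock runtime client."""

    def handler(event, context):
        # Hypothetical event shape: {"description": "..."}
        embedding = embed_description(client, event["description"])
        return {"statusCode": 200, "body": json.dumps({"embedding": embedding})}

    return handler
```

Passing the client in, instead of creating it inside the function, makes it easy to swap in a fake client when testing locally without AWS credentials.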

Embeddings -> Similarity

The embeddings are generated using Amazon Titan. They are vectors in a high-dimensional space whose directions represent the meaning of the descriptions, so two descriptions with similar meanings point in similar directions. We measure this with cosine similarity:

$$\text{similarity} = \frac{A \cdot B}{\|A\| \, \|B\|}$$

where $$A$$ and $$B$$ are the embeddings of two descriptions. We then match each person with the person whose description has the highest similarity to theirs.
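Here is a minimal sketch of that formula and the matching step in plain Python. The function names `cosine_similarity` and `best_match` and the tiny 3-dimensional vectors are invented for illustration; real Titan embeddings are far longer, and our hackathon code may have looked different.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction; treat it as "not similar".
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(user_id, embeddings):
    """Return (other_id, score) for the most similar other user, or None."""
    target = embeddings[user_id]
    scored = [
        (other_id, cosine_similarity(target, vector))
        for other_id, vector in embeddings.items()
        if other_id != user_id
    ]
    if not scored:
        return None
    return max(scored, key=lambda pair: pair[1])


# Toy embeddings, purely illustrative.
toy_embeddings = {
    "alice": [1.0, 0.0, 1.0],
    "bob": [1.0, 0.1, 0.9],
    "carol": [0.0, 1.0, 0.0],
}

match_id, score = best_match("alice", toy_embeddings)
print(match_id, round(score, 3))
```

With these toy vectors, alice is paired with bob, since their vectors point in nearly the same direction while carol's is orthogonal to alice's.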

Graphical Projection

My friend Jet figured out how to project the embeddings onto a 2D plane. We used this to visualize the embeddings and see how the descriptions cluster. Check out his blog post for more details.

Conclusion

We managed to get a semi-working algorithm in 6 hours. Apart from all that I'm grateful for blah and blah, I hope to work on this project more in the future. I think my teammates are pretty cracked and cool!

Also, we won, so I guess that's cool. (definitely not flexing)