Tuesday, March 2, 2021

Show HN: NNext – A feature vector datastore and nearest neighbor search service https://ift.tt/308JSBH

Hi all. As a machine learning engineer, I mostly think in terms of feature vectors, embeddings, and matrices. Indeed, one of the most useful byproducts of deep neural networks is embeddings, because they allow us to represent high-dimensional data in terms of lower-dimensional latent vectors. These feature vectors can be used for downstream applications like similarity search, recommendation systems, and near-duplicate detection.

As an ML engineer, I was frustrated by the lack of a datastore in which vectors are first-class citizens. As a result, most ML engineers, including myself, end up using awkward workarounds to store vectors, such as arrays in SQL/NoSQL databases, or stringifying vectors and storing them as text in in-memory caching systems such as Redis. Furthermore, these systems don't allow for vector-based query operations such as nearest neighbor search. Consequently, engineers have to deploy additional approximate nearest neighbor search systems such as Facebook's FAISS or Spotify's ANNOY. These systems, while nifty and fast, are difficult to install and costly to maintain.

To address these issues, I built NNext, a managed vector datastore in which vectors are first-class citizens. NNext allows you to store vectors along with any JSON blob of metadata. Furthermore, NNext comes with a fast approximate nearest-neighbor (ANN) search capability.

I would love to get feedback on your experience as a data scientist or ML engineer storing feature vectors and using ANN systems. Please shoot me an email at p@nnext.net. https://nnext.net

March 3, 2021 at 04:22AM
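To make the idea of storing vectors alongside JSON metadata and querying them by similarity concrete, here is a minimal sketch in plain Python. It is not NNext's API: the `ToyVectorStore` class, its `put` and `nearest` methods, and the sample vectors are all hypothetical names made up for illustration. It also performs exact brute-force search with cosine similarity rather than approximate search, so it only illustrates the kind of query being described.

```python
import json
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Hypothetical in-memory store (an assumption, not NNext's API)."""

    def __init__(self):
        # key -> (vector, metadata serialized as a JSON string)
        self._rows = {}

    def put(self, key, vector, metadata=None):
        self._rows[key] = (list(vector), json.dumps(metadata or {}))

    def nearest(self, query, k=1):
        """Return the k most similar entries as (key, score, metadata)."""
        scored = [
            (cosine_similarity(query, vec), key, json.loads(meta))
            for key, (vec, meta) in self._rows.items()
        ]
        # Sort on the score only, highest similarity first.
        scored.sort(key=lambda item: item[0], reverse=True)
        return [(key, score, meta) for score, key, meta in scored[:k]]


store = ToyVectorStore()
store.put("cat", [1.0, 0.0, 0.1], {"label": "animal"})
store.put("dog", [0.9, 0.1, 0.2], {"label": "animal"})
store.put("car", [0.0, 1.0, 0.0], {"label": "vehicle"})

results = store.nearest([1.0, 0.0, 0.0], k=2)
for key, score, meta in results:
    print(key, round(score, 3), meta)
```

This scan compares the query against every stored vector, so its cost grows linearly with the number of vectors. ANN libraries such as FAISS and ANNOY build an index that trades a little accuracy for much faster lookups on large collections.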
