Senior AI/ML Engineer, Search
Zark Lab is building foundation models for blockchain transactions and information. Our work focuses on search, retrieval, and generative modeling applied to on-chain and off-chain data. We are developing systems that enable efficient indexing, retrieval-augmented generation (RAG), and vector search for structured and unstructured blockchain datasets. Our models process billions of transactions and smart contracts across multiple blockchains, applying sequence modeling, graph-based learning, and language models to extract insights and improve data accessibility.
The team consists of former founders, senior engineers, and executives from Google, Meta, Goldman Sachs, and other leading technology and financial institutions. Our backgrounds span large-scale distributed systems, machine learning, engineering, and information retrieval, and we are focused on advancing the state of AI-driven search and computation for blockchains.
Responsibilities
Build large-scale web scraping and ingestion pipelines for on-chain and off-chain blockchain data
Develop and optimize search architectures, integrating vector search, ANN retrieval, and ranking models
Fine-tune LLMs for query expansion, semantic search, and retrieval-augmented generation (RAG)
Reduce query latency through index optimization, ANN search, and distributed execution
Scale distributed indexing pipelines for efficient storage, deduplication, and retrieval
Optimize distributed storage and compute with Snowflake, ClickHouse, RocksDB, and vector databases
Build scalable systems to process high-throughput blockchain transactions and queries
Deploy and optimize cloud workloads on GCP with Kubernetes and containerized processing
Qualifications
BS/MS/PhD in Computer Science or a related field
5+ years of experience in AI/ML, distributed search, or large-scale data processing
Strong programming skills in Python, TypeScript, or Node.js
Expertise in database design (SQL and NoSQL) and high-throughput data systems
Experience with web crawling, data scraping, and large-scale ingestion pipelines
Knowledge of vector search, retrieval-augmented generation (RAG), and embedding models (preferred)
Hands-on experience with GCP, Kubernetes, and Docker for cloud-scale deployment
Passion for blockchain, AI search, and distributed systems