pebblebed ventures

Datalab

Document Intelligence for enterprise AI

Datalab trains specialized AI models for document intelligence, focusing on OCR, layout analysis, and PDF-to-Markdown conversion. The company develops small foundation models (100M to 500M parameters) that transform complex documents into machine-readable structured data at scale while running on consumer-grade GPUs.

Its open-source models are state-of-the-art and easy to use, and they have been adopted by hundreds of teams and researchers at leading organizations such as Anthropic, Harvard, Stanford, and MIT.

Datalab raised a $3.5M seed round led by Pebblebed, with participation from Peak XV and angels including Balaji Srinivasan, Jeff Hammerbacher, and founding members of Hugging Face.

© 2025 Pebblebed · San Francisco, CA