NOT a furry tiago.zip

datasets

these datasets may contain personal or sensitive information. by downloading or using them, you agree to comply with all applicable local data protection laws. you may not republish, sell, or share the data, or use it to train genai models. we are not responsible for any consequences arising from the use of the data.

twitter

collection of 600M+ tweets and 50M profiles scraped from Twitter during 2025 and 2026. scrape ongoing

mastodon.parquet

hundreds of millions of posts and profiles scraped from major mastodon instances during 2026. scrape ongoing

15+ GB

bsky.duckdb

tens of millions of posts and profiles scraped from bluesky's firehose and API during 2026. scrape ongoing

twitter-typeahead.db

twitter typeahead data scraped in mid-2025. includes 186k topics and 78M users

linktree.db

over 14M linktree profiles, including links, bio, and social media pages

manifold.zip

data dumps, in the form of JSONL files, from the internal Supabase API of manifold markets, a prediction markets platform.

5.0 GB
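
a minimal sketch of how the JSONL files in manifold.zip could be read without unpacking the whole archive. the layout of the archive is not documented here, so the sketch assumes the members end in .jsonl and are UTF-8 encoded; the function name iter_jsonl_from_zip is hypothetical.

```python
import io
import json
import zipfile
from typing import Any, Iterator, Tuple


def iter_jsonl_from_zip(zip_path: str) -> Iterator[Tuple[str, Any]]:
    """Yield (member_name, record) for every non-empty JSONL line in the archive.

    Assumption: members of interest end in ".jsonl" and are UTF-8 text.
    """
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if not name.endswith(".jsonl"):
                continue  # skip directories and non-JSONL members
            # Stream the member line by line instead of loading it into memory.
            with archive.open(name) as raw:
                for line in io.TextIOWrapper(raw, encoding="utf-8"):
                    line = line.strip()
                    if line:
                        yield name, json.loads(line)
```

iterating with `for name, record in iter_jsonl_from_zip("manifold.zip")` would then give one parsed record at a time, which keeps memory use flat for a multi-GB archive.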