Documentation

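What is 中文档案馆?

中文档案馆 is a metadata search engine focused on the historical content of the Chinese-language internet. By indexing archived snapshots from the Internet Archive and other sources, we help users discover and explore the historical memory of the Chinese internet.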

Data Sources

Our main data sources are:

Primary Source

Internet Archive (互联网档案馆)

The Wayback Machine archives websites from around the world, including Chinese websites. We use their public APIs to discover and index archived content.

archive.org
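As an illustration of what discovery through a public Wayback Machine API can look like, here is a minimal sketch against the Wayback CDX server. The source does not say which API 中文档案馆 actually calls, so the choice of the CDX endpoint and the helper names (`build_cdx_query`, `parse_cdx_rows`, `snapshot_url`) are assumptions made for this example. The code only builds the query URL and parses a response body; it does not send any request.

```python
import json
from urllib.parse import urlencode

# Public CDX endpoint of the Wayback Machine. Using it here is an
# assumption; the project may rely on a different Internet Archive API.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(url_pattern, year=None, limit=50):
    """Build a CDX query URL that lists captures matching url_pattern."""
    params = {"url": url_pattern, "output": "json", "limit": str(limit)}
    if year is not None:
        # from/to accept partial timestamps, so a bare year is enough.
        params["from"] = str(year)
        params["to"] = str(year)
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def parse_cdx_rows(payload):
    """Convert the JSON array-of-arrays response into a list of dicts.

    With output=json, the first row holds the field names and each
    following row describes one capture.
    """
    rows = json.loads(payload)
    if not rows:
        return []
    header, *captures = rows
    return [dict(zip(header, row)) for row in captures]


def snapshot_url(capture):
    """Return the Wayback Machine URL that replays a parsed capture."""
    return f"https://web.archive.org/web/{capture['timestamp']}/{capture['original']}"
```

Each parsed capture carries a timestamp and the original URL, which is enough to derive a date and an archive link without downloading the page body.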

Secondary Sources (Metadata Only)

Tieba & Douban (贴吧 & 豆瓣)

We index metadata (titles, timestamps, tags) from archived snapshots of these platforms. We do NOT store full content or scrape live pages.

tieba.baidu.com douban.com

Reference Only

Baidu (百度)

Baidu links are provided as outbound references only. We do NOT scrape, iframe, or store any content from Baidu.

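What We Do NOT Store

We strictly comply with copyright rules: we index and store only metadata and never store the full text of any copyrighted content.

✕ Full post content
✕ Full comments or replies
✕ Users' private information
✕ Copyrighted multimedia content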

What We DO Store:

✓ Title (标题)
✓ Short snippet/description (简短摘要)
✓ Year/Date (年份/日期)
✓ Source identifier (来源标识)
✓ Archive URL (存档链接)
✓ Tags/Categories (标签/分类)
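The stored fields listed above could be modeled as a small record type. This is a sketch only: the class name `ArchiveRecord`, the field names, the `make_record` helper, and the 200-character snippet cap are all hypothetical and are not taken from the project's actual schema.

```python
from dataclasses import dataclass, field, asdict
from typing import List

# Hypothetical cap that keeps a snippet short; the real limit is not documented.
MAX_SNIPPET_CHARS = 200


@dataclass(frozen=True)
class ArchiveRecord:
    """Metadata-only record; it has no field for full page content."""
    title: str
    snippet: str
    date: str          # a year such as "2010" or a full date
    source: str        # source identifier, e.g. "archive_org"
    archive_url: str   # link to the archived snapshot
    tags: List[str] = field(default_factory=list)


def make_record(title, text, date, source, archive_url, tags=()):
    """Build a record, collapsing whitespace and truncating text to a snippet."""
    snippet = " ".join(text.split())[:MAX_SNIPPET_CHARS]
    return ArchiveRecord(
        title=title.strip(),
        snippet=snippet,
        date=date,
        source=source,
        archive_url=archive_url,
        tags=list(tags),
    )
```

Truncating at construction time means the full text never reaches storage, which matches the metadata-only policy described above.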

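Legal and Ethical Position

中文档案馆 is committed to operating lawfully, compliantly, and responsibly. We respect the rights of content creators, comply with applicable laws and regulations, and take measures to protect user privacy.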

Our Commitments:

• We only index publicly available archived content
• We respect robots.txt and rate limits
• We do not store copyrighted full-text content
• We provide links to original/archived sources
• We respond to takedown requests promptly
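The robots.txt and rate-limit commitments above can be sketched with Python's standard library. The user agent string, the `RateLimiter` class, and the helper names are hypothetical; the project's real ingestion scripts may enforce these rules differently.

```python
import time
from urllib.robotparser import RobotFileParser

# Hypothetical user agent; the real crawler identifier is not documented.
USER_AGENT = "example-archive-indexer"


def load_robots(robots_txt):
    """Parse the text of a robots.txt file into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser, url):
    """Return True if robots.txt allows USER_AGENT to fetch url."""
    return parser.can_fetch(USER_AGENT, url)


class RateLimiter:
    """Enforce a minimum interval in seconds between consecutive requests."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

A crawler loop would call `may_fetch` before each request and `wait()` immediately before sending it. The clock and sleep functions are injectable so the limiter can be tested without real delays.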

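How to Submit Content

If you find that important historical content is missing, or know of internet history worth preserving, please contact us to submit it.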

Ways to Contribute:

1. Submit URLs to the Wayback Machine at archive.org/web
2. Open an issue on our GitHub repository with details about missing content
3. Contribute to our open-source ingestion scripts

API Reference

GET /api/search

Search archived content by query with optional filters.

?q=查询词&limit=20&offset=0&source=archive_org&year=2010
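A client might assemble this query string as follows. The host in `BASE_URL` and the function name `build_search_url` are placeholders for this sketch; the parameter names come from the example above, and treating `source` and `year` as the optional filters is an assumption.

```python
from urllib.parse import urlencode

# Placeholder host; substitute the real deployment URL.
BASE_URL = "https://archive.example"


def build_search_url(q, limit=20, offset=0, source=None, year=None):
    """Build a GET /api/search URL, leaving out filters that are not set."""
    params = {"q": q, "limit": limit, "offset": offset}
    if source is not None:
        params["source"] = source
    if year is not None:
        params["year"] = year
    # urlencode percent-encodes non-ASCII query terms as UTF-8.
    return f"{BASE_URL}/api/search?{urlencode(params)}"


url = build_search_url("查询词", source="archive_org", year=2010)
```

Paging through results would then be a matter of increasing `offset` by `limit` on each call.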

POST /api/ai/search

AI-enhanced search with query understanding and result summarization.

{ "query": "搜索词", "language": "zh", "include_summary": true }
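The request body above could be prepared like this. The host in `BASE_URL`, the function name, and the `Content-Type` header value are assumptions for this sketch; only the three JSON fields come from the example. The function builds the request object without sending it.

```python
import json
from urllib.request import Request

# Placeholder host; substitute the real deployment URL.
BASE_URL = "https://archive.example"


def build_ai_search_request(query, language="zh", include_summary=True):
    """Build (but do not send) a POST /api/ai/search request with a JSON body."""
    body = json.dumps(
        {"query": query, "language": language, "include_summary": include_summary},
        ensure_ascii=False,  # keep Chinese text readable in the payload
    ).encode("utf-8")
    return Request(
        f"{BASE_URL}/api/ai/search",
        data=body,
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


request = build_ai_search_request("搜索词")
```

Passing the returned object to `urllib.request.urlopen` would send it; the shape of the response is not documented here, so parsing it is left out.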

POST /api/ingest/archive

Trigger ingestion from Archive.org (requires authentication).