# Document Processing
# Flag
- CSV processing https://github.com/jqnatividad/qsv
- https://sourceforge.net/projects/opencsv
# yaml/json/xml/ini/conf/toml
- XML: Extensible Markup Language (XML)
- JSON: JavaScript Object Notation (JSON)
- YAML: YAML Ain't Markup Language (YAML)
- https://github.com/cbor
- BSON (Binary JSON) https://bsonspec.org
- https://github.com/json-schema-org
- https://www.json.org/json-zh.html
- https://github.com/Cyphrme
- https://github.com/json-api/json-api
- https://github.com/garycourt/JSV
- https://github.com/toml-lang/toml
- https://github.com/yaml
- https://github.com/topics/env
- Serialization https://github.com/protocolbuffers/protobuf
- https://github.com/google/flatbuffers
- https://github.com/msgpack
- https://github.com/apache/avro
- https://github.com/ebourg/hessian
- https://github.com/alipay/fury
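Data and files generally fall into three types:
- Configuration files: ini, conf, and properties files. Good for simple variables and settings; they support at most two levels and are a poor fit for deeply nested data.
- Tabular files: csv, Excel, and similar. Good for large amounts of homogeneous data; a poor fit for hierarchical data.
- Nested files: XML, HTML, JSON, YAML, TOML, and similar. Good for a single record or a few records with deep nesting; a poor fit for large volumes of data.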
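
The three types above can be illustrated with a minimal sketch using only the Python standard library. The sample data (the `server` section, the `name`/`score` columns, the `user` record) is made up for illustration and does not come from any of the linked projects.

```python
import configparser
import csv
import io
import json

# Configuration style: sections and key/value pairs, at most two levels.
ini_text = """
[server]
host = localhost
port = 8080
"""
config = configparser.ConfigParser()
config.read_string(ini_text)
port = config.getint("server", "port")

# Tabular style: many rows that share the same columns.
csv_text = "name,score\nalice,90\nbob,85\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))
total_score = sum(int(row["score"]) for row in rows)

# Nested style: a single record with arbitrary depth.
json_text = '{"user": {"name": "alice", "tags": ["admin", "dev"]}}'
record = json.loads(json_text)
first_tag = record["user"]["tags"][0]

print(port, total_score, first_tag)
```

Each parser returns the shape its format is suited to: a section/key lookup, a list of uniform rows, and a nested dictionary.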
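YAML is compatible with JSON syntax (YAML 1.2 is designed as a superset of JSON). It is concise, powerful, and flexible, and makes it easy to build hierarchical data.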
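| Indicator | Meaning |
|---|---|
| `\n`, `\` | In a double-quoted string, `\n` inserts a line break; a trailing `\` continues the string on the next line |
| `\|` | Literal: line breaks are kept; a single trailing line break is kept |
| `\|+` | Literal: line breaks are kept; all trailing line breaks are kept |
| `\|-` | Literal: line breaks are kept; trailing line breaks are removed |
| `>` | Folded: single line breaks become spaces; a single trailing line break is kept |
| `>+` | Folded: single line breaks become spaces; all trailing line breaks are kept |
| `>-` | Folded: single line breaks become spaces; trailing line breaks are removed |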
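
To make the block indicators concrete, here is a simplified emulation in plain Python. It is a sketch, not a YAML parser: the function name `render_block_scalar` and its parameters are hypothetical, and it ignores indentation handling, more-indented lines, and leading blank lines, which a real parser treats specially.

```python
import re


def render_block_scalar(raw: str, style: str = "|", chomp: str = "") -> str:
    """Simplified emulation of YAML block scalars (illustrative only).

    raw   -- the de-indented block content, including trailing line breaks
    style -- "|" (literal) or ">" (folded)
    chomp -- "" (clip), "-" (strip) or "+" (keep)
    """
    body = raw.rstrip("\n")
    trailing = raw[len(body):]  # the run of trailing line breaks

    if style == ">":
        # Folding: a lone line break becomes a space ...
        body = re.sub(r"(?<!\n)\n(?!\n)", " ", body)
        # ... and a run of k >= 2 line breaks shrinks to k - 1.
        body = re.sub(r"\n(\n+)", r"\1", body)

    if chomp == "-":
        return body             # strip: drop every trailing line break
    if chomp == "+":
        return body + trailing  # keep: preserve every trailing line break
    return body + trailing[:1]  # clip: keep at most one trailing line break


raw = "line one\nline two\n\n"
for style in ("|", ">"):
    for chomp in ("", "+", "-"):
        print(style + chomp, repr(render_block_scalar(raw, style, chomp)))
```

With this input, `|` yields `'line one\nline two\n'` and `>-` yields `'line one line two'`; the `+` variants keep both trailing line breaks.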
XML
- https://sourceforge.net/projects/sax
- https://github.com/davidmegginson
- https://sourceforge.net/projects/rapidxml
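Extensible Markup Language (XML) is a markup language.
Markup refers to symbols that a computer can interpret; with such markup, computers can exchange and process documents that carry all kinds of information.
The markup can come from a widely used, predefined language such as HTML, or it can be defined freely by the people concerned, as with XML; this is what makes the language extensible.
XML was derived from Standard Generalized Markup Language (SGML) by simplification and modification. The technologies most often used with it include XML itself, Extensible Stylesheet Language (XSL), XBRL, and XPath.
Simple API for XML (SAX) is a parser API that reads XML sequentially. It is not a standard proposed by the W3C. It is event-driven: the parser invokes callbacks as it encounters content. It is a popular alternative to the Document Object Model (DOM); DOM must read the entire XML document and then build a DOM tree in memory, creating a Node object for every node in the tree.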
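
The contrast between the two models can be sketched with Python's standard `xml.sax` and `xml.dom.minidom` modules. The sample document and the `TitleHandler` class are made up for illustration.

```python
import xml.sax
from xml.dom import minidom

XML_TEXT = b"""<?xml version="1.0"?>
<library>
  <book id="1"><title>Dune</title></book>
  <book id="2"><title>Emma</title></book>
</library>"""


class TitleHandler(xml.sax.ContentHandler):
    """Collects the text of every <title> element via SAX callbacks."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._chunks = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._chunks = []

    def characters(self, content):
        # Text may arrive in several pieces, so buffer it.
        if self._in_title:
            self._chunks.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._chunks))
            self._in_title = False


# SAX: the parser streams through the document and fires callbacks.
handler = TitleHandler()
xml.sax.parseString(XML_TEXT, handler)
sax_titles = handler.titles

# DOM: the whole document is loaded into a tree before it is queried.
dom = minidom.parseString(XML_TEXT)
dom_titles = [node.firstChild.data for node in dom.getElementsByTagName("title")]

print(sax_titles, dom_titles)
```

Both approaches collect the same titles. The SAX handler only holds the state it chooses to keep, while the DOM version keeps the full tree in memory, which is convenient for random access but costly for large documents.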
- https://github.com/topics/pdf
- https://github.com/topics/pdflib
- https://github.com/topics/pdf-viewer
- https://github.com/topics/poi
- https://github.com/topics/excel
- https://github.com/topics/word
- https://github.com/search?q=PDF+Reader
- PDF parsing https://github.com/zxyle/PDF-Explained
- Unlocking PDF files: rendering PDF content with JavaScript and Canvas
- PDF viewer and toolkit http://www.xpdfreader.com
- PDF readers https://fsfe.org/pdfreaders/pdfreaders.zh.html
- https://zh.pdf24.org
- PDF rendering https://poppler.freedesktop.org
- https://www.mupdf.com
- https://github.com/bblanchon/pdfium-binaries
- https://github.com/wmjordan/PDFPatcher
- PDF manipulation https://github.com/topics/pdfbox
- https://github.com/apache/pdfbox
- https://github.com/apache/tika
- https://github.com/itext
- https://github.com/flyingsaucerproject/flyingsaucer
- https://github.com/openhtmltopdf/openhtmltopdf
- https://github.com/LibrePDF/OpenPDF
- https://github.com/Frooodle/Stirling-PDF
- https://github.com/ofdrw/ofdrw
- https://github.com/MrRio/jsPDF
- https://github.com/mozilla/pdf.js
- https://github.com/Hopding/pdf-lib
- Reading PDF https://github.com/ledongthuc/pdf
- https://github.com/google/go-tika
- https://github.com/pdfcpu/pdfcpu
- Free, with watermark https://github.com/unidoc
- Creating PDF https://github.com/johnfercher/maroto
- https://github.com/jung-kurt/gofpdf
- https://github.com/tiechui1994/gopdf
- https://github.com/apache/xmlgraphics-fop
- https://github.com/apache/xmlgraphics-fop-pdf-images
- https://github.com/bpampuch/pdfmake
- python https://github.com/py-pdf/pypdf
- https://github.com/pymupdf/PyMuPDF
- https://hg.reportlab.com/hg-public
- https://github.com/pdfminer/pdfminer.six
- https://github.com/alephdata/pdflib
- https://github.com/pyx-project/pyx
- https://github.com/reingart/pyfpdf
- https://github.com/pmaupin/pdfrw
# Excel/Word
- https://github.com/rustytsuki/rust-office
- https://github.com/Api2Pdf
Go
- excel https://github.com/360EntSecGroup-Skylar/excelize
- https://github.com/qax-os/excelize
- https://github.com/shakinm/xlsReader
Java
- Excel https://github.com/apache/poi
- https://github.com/jxlsteam
- https://github.com/alibaba/easyexcel
- https://github.com/plutext/docx4j
- https://gitee.com/lemur/easypoi
- https://github.com/dhatim/fastexcel
- https://sourceforge.net/projects/jexcelapi
- https://github.com/aspose
- https://github.com/liaochong/myexcel
- https://github.com/monitorjbl/excel-streaming-reader
- https://github.com/jeecgboot/autopoi
- https://github.com/subtlelib/poi
- https://github.com/Thomaswoood/simple-excel
- https://github.com/ck-open/jumper
- https://github.com/crealytics/spark-excel
- https://github.com/pig-mesh/excel-spring-boot-starter
- Word output https://github.com/Sayi/poi-tl
- https://github.com/MSPaintIDE/NewOCR
- Conversion https://github.com/documents4j/documents4j
- https://sourceforge.net/projects/csvjdbc
- https://github.com/opensagres
- https://github.com/jodconverter/jodconverter
- https://github.com/Saxonica
Python
- https://github.com/pyexcel
- https://github.com/jmcnamara/XlsxWriter
- https://foss.heptapod.net/openpyxl/openpyxl
JavaScript
- excel https://github.com/exceljs/exceljs
- https://github.com/dtjohnson/xlsx-populate
- https://github.com/SheetJS
- https://github.com/protobi/js-xlsx
- https://github.com/Ctrl-Ling/XLSX-Style-Utils
- https://github.com/skyrocks/x-xlsx-style
- https://github.com/duhaohao/xlsx-styleable
- https://github.com/LuisEnMarroquin/json-as-xlsx
- A simple Excel import implemented with js-xlsx
- A JS library for working with Excel: how to use XLSX
- https://github.com/d-band/better-xlsx
- https://github.com/dream-num
- https://github.com/ag-grid/ag-grid
- https://github.com/myliang/x-spreadsheet
- TableExport https://github.com/clarketm/TableExport
- tableExport.jquery.plugin https://github.com/hhurz/tableExport.jquery.plugin
- excellentexport https://github.com/jmaister/excellentexport
- https://github.com/jspreadsheet
- docx-preview https://github.com/VolodymyrBaydalka/docxjs
- Document viewer https://github.com/webodf/ViewerJS
NodeJS