tex2docx — LaTeX to Word Converter

Requirements

- pandoc (system install): winget install pandoc, or download it from pandoc.org
- Python packages: pip install python-docx lxml pypandoc_binary

Usage

    python scripts/tex2docx.py input.tex [output.docx]
If output.docx is omitted, the output is written to input.docx (the input's base name with a .docx extension) in the same directory.
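The default-output rule can be sketched in a few lines. This is an illustration only: `default_output_path` is a hypothetical helper name, not a function taken from the script.

```python
from pathlib import Path
from typing import Optional


def default_output_path(tex_path: str, docx_path: Optional[str] = None) -> Path:
    """Return the explicit output path, or the input path with a .docx suffix."""
    if docx_path:
        return Path(docx_path)
    # Same directory, same base name, .docx extension.
    return Path(tex_path).with_suffix(".docx")
```

With this rule, `paper/input.tex` maps to `paper/input.docx` unless a second argument is given.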
How It Works (Three Phases)

    .tex ──→ [pandoc] ──→ OMML equations (13+ Word-editable formulas)
      │
      └──→ [Custom parser] ──→ Native Word tables   ├──→ Final .docx
                               Embedded figures     │    (merged)
                               Formatted refs       │
                               IEEE layout & font   ┘

Phase 1 — Pandoc
Runs pandoc via pypandoc. The input file must be in its own directory (with a figures/ subfolder if images exist). The script changes into the .tex file's directory before running pandoc so that image paths resolve correctly.
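A minimal sketch of this phase, assuming pypandoc's `convert_file` is the entry point. The helper names `working_directory` and `run_pandoc` are hypothetical, and the real script may handle the directory change differently.

```python
import contextlib
import os
from pathlib import Path


@contextlib.contextmanager
def working_directory(path):
    """Temporarily change the current working directory, then restore it."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)


def run_pandoc(tex_path, out_path):
    """Convert a .tex file to .docx from inside the .tex file's directory."""
    # Imported lazily so the helper above is usable without pypandoc installed.
    import pypandoc

    tex = Path(tex_path).resolve()
    out = Path(out_path).resolve()
    with working_directory(tex.parent):
        # Relative image paths such as figures/plot.png resolve from here.
        pypandoc.convert_file(tex.name, "docx", format="latex", outputfile=str(out))
    return out
```

Resolving both paths before changing directory keeps a relative output path pointing at the location the caller intended.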
Phase 2 — Custom LaTeX Parser
Regex-based extraction of:

- Tables: \begin{table} → Word Table objects (full borders, centered, 8pt Times New Roman)
- Figures: \includegraphics{} + \caption{} → PNG/PDF embeds with italic captions
- References: \thebibliography → formatted entries with hanging indent
- Sections: \section{}, \subsection{} → bold headings
- Metadata: \title, \author, \abstract, \IEEEkeywords

Phase 3 — Merge
OMML equation paragraphs from pandoc are inserted into the cleanly built document. Body paragraphs get a 0.25in first-line indent. All LaTeX commands (\textbf, \toprule, \ref, \cite, \begin{itemize}, etc.) are stripped from the text content.
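The command stripping can be approximated with a handful of regular expressions. The sketch below is not the script's own code: `strip_latex` is a hypothetical name, and dropping the arguments of \ref, \cite, and \label entirely (rather than keeping the key) is an assumption.

```python
import re


def strip_latex(text: str) -> str:
    """Reduce a LaTeX fragment to plain text (rough sketch, not a full parser)."""
    # Drop environment markers such as \begin{itemize} and \end{itemize}.
    text = re.sub(r"\\(?:begin|end)\{[^}]*\}", "", text)
    # Assumption: reference-style commands vanish together with their argument.
    text = re.sub(r"\\(?:ref|cite|label)\{[^}]*\}", "", text)
    # Unwrap formatting commands, keeping the argument: \textbf{x} -> x.
    # Repeat until stable so nested commands are unwrapped from the inside out.
    previous = None
    while previous != text:
        previous = text
        text = re.sub(r"\\[a-zA-Z]+\*?\{([^{}]*)\}", r"\1", text)
    # Remove remaining bare commands such as \toprule or \item.
    text = re.sub(r"\\[a-zA-Z]+\*?", "", text)
    # Collapse the whitespace left behind.
    return re.sub(r"\s+", " ", text).strip()
```

The order matters: environments and references are removed before the generic unwrapping step, otherwise their arguments would leak into the text.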
Output Format

| Feature | Detail |
| --- | --- |
| Font | Times New Roman (10pt body, 9pt table/figure, 8pt refs) |
| Layout | A4, two-column IEEE conference style |
| Equations | OMML (double-click to edit in Word) |
| Tables | Native Word tables, all borders |
| Figures | PNG/PDF embedded with "Fig." captions |
| References | Hanging indent, [bN] format |
| First indent | 0.25in on body paragraphs |

Verification

    python scripts/verify.py output.docx
Reports paragraph, table, image, and equation counts and checks for LaTeX residue.
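For readers who want to run a similar check by hand, here is a rough standard-library sketch. It is not the verification script itself: `inspect_docx` is a hypothetical name, and counting elements with regular expressions over the raw `word/document.xml` is a simplifying assumption (a real checker would more likely parse the XML).

```python
import re
import zipfile

# Rough residue marker: any backslash followed by letters, e.g. \ref or \textbf.
RESIDUE = re.compile(r"\\[a-zA-Z]+")


def inspect_docx(path):
    """Count basic elements in word/document.xml and collect LaTeX-like residue."""
    with zipfile.ZipFile(path) as zf:
        xml = zf.read("word/document.xml").decode("utf-8")
    # Visible text lives in <w:t> elements.
    texts = re.findall(r"<w:t(?:\s[^>]*)?>([^<]*)</w:t>", xml)
    return {
        # The character class keeps <w:p> from matching <w:pPr>, and so on.
        "paragraphs": len(re.findall(r"<w:p[ />]", xml)),
        "tables": len(re.findall(r"<w:tbl[ />]", xml)),
        "images": len(re.findall(r"<w:drawing[ />]", xml)),
        "equations": len(re.findall(r"<m:oMath[ />]", xml)),
        "residue": [t for t in texts if RESIDUE.search(t)],
    }
```

An empty `residue` list suggests that no backslash commands survived into the visible text.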
Chinese (ctex) Support
Fully supports Chinese LaTeX documents using the ctex package:
- Chinese section titles (引言, 方法, 实验, 结论, etc.) are recognized
- \section*{} (star variant) is supported
- Chinese table headers are preserved
- Chinese text in titles is rendered via the w:eastAsia font fallback
- \title{...} and \author{...} residue paragraphs are filtered out

Limitations

- Inline math ($...$) becomes plain text (italic), not OMML; only \begin{equation}, \begin{align}, and \[...\] become editable equations
- No .bib support: references must be in a \thebibliography{} environment
- PNG images preferred: the script tries PNG first, then falls back to PDF
- Pandoc path: the system pandoc binary must be discoverable by pypandoc

Script: scripts/tex2docx.py
Self-contained (660+ lines). Key internal functions:

| Function | Role |
| --- | --- |
| extract_tex() | Parse all structural elements from .tex |
| extract_omml() | Pull OMML XML from pandoc output |
| build_docx() | Construct final document with all components |
| clean() | Strip LaTeX commands to plain text |
| add_table() | Build Word table with borders |
| add_figure() | Embed image + caption |
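To give a feel for the regex-based extraction used in Phase 2, the sketch below pulls section headings out of a .tex string, including the starred variant and Chinese titles. `SECTION_RE` and `find_sections` are hypothetical names, and the pattern is an assumption about how such matching could work, not the script's own regex.

```python
import re

# Matches \section{...}, \section*{...}, \subsection{...}, \subsection*{...}.
SECTION_RE = re.compile(r"\\(sub)?section\*?\{([^{}]*)\}")


def find_sections(tex: str):
    """Return (level, title) pairs in document order; level 1 is \\section."""
    return [
        (2 if m.group(1) else 1, m.group(2).strip())
        for m in SECTION_RE.finditer(tex)
    ]
```

Titles containing nested braces (for example `\section{A \textbf{B}}`) are not matched by this simple pattern and would need a brace-aware scan.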