Use this skill when the user wants to convert a thesis outline document (.docx) into a complete, publication-ready master's thesis. The output can be either .docx (using python-docx) or LaTeX format (default: LaTeX). This skill expands concise arrow-notation outlines into full academic prose following the NYCU (National Yang Ming Chiao Tung University) thesis format. Trigger whenever the user mentions 'generate thesis', 'expand outline to thesis', 'convert outline to thesis', 'thesis from outline', 'write thesis', or asks to produce a complete thesis document from an outline.
This skill converts a thesis outline document (.docx with arrow notation) into a complete, publication-ready master's thesis. It supports two output formats: LaTeX (default) and DOCX.
The transformation is based on patterns observed in the outline-to-thesis workflow of a senior student (Chris Huang) at the NYCU Institute of Computer and Communications Security.
| Task | Approach |
|---|---|
| Read outline .docx | python -m markitdown outline.docx or python-docx |
| Read reference PDF | pypdf or pdfplumber |
| Generate thesis .docx | Use python-docx (see docx skill) |
| Generate thesis .tex | Write LaTeX files directly |
| Output format | User specifies docx or latex; default is latex |
The user can specify the desired output format. If not specified, default to LaTeX.
| User Says | Output Format |
|---|---|
| "output as docx" / "Word format" / "generate .docx" | DOCX (python-docx) |
| "output as latex" / "LaTeX format" / "generate .tex" | LaTeX |
| (nothing specified) | LaTeX (default) |
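The mapping in the table above could be sketched as a small helper. This is only an illustration: the function name `resolve_output_format` and the keyword pattern are assumptions, not part of the skill. The pattern deliberately ignores a `.docx` input filename, so a request that only names the outline file still falls back to LaTeX.

```python
import re

# Hypothetical pattern derived from the table above: DOCX is chosen only when
# "docx" names the requested output ("as docx", "generate .docx",
# "into thesis_v1.docx") or when the user asks for "Word format".
_DOCX_REQUEST = re.compile(
    r"\b(?:as|into|to|generate)\s+\S*docx\b|\bword\s+format\b",
    re.IGNORECASE,
)


def resolve_output_format(request: str) -> str:
    """Return 'docx' or 'latex' for a user request; LaTeX is the default."""
    if _DOCX_REQUEST.search(request):
        return "docx"
    return "latex"


if __name__ == "__main__":
    print(resolve_output_format("Convert paper_outline_v1.docx into thesis_v1.docx"))
    print(resolve_output_format("Convert paper_outline_v1.docx into a complete thesis"))
```

Ambiguous requests (for example, ones that mention both formats) are not handled by this sketch and are better resolved by asking the user.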
The thesis format is based on the NYCU master's thesis template, derived from Chris Huang's thesis: "Leveraging Reinforcement Learning for LLM-based Generic Honeypots with MITRE Engage Framework" (June 2025).
1. Title Pages (Chinese + English)
2. Chinese Abstract (摘要)
3. English Abstract
4. Acknowledgments (誌謝)
5. Table of Contents (目錄)
6. List of Figures (圖目錄)
7. List of Tables (表目錄)
8. Chapter 1: Introduction
9. Chapter 2: Background and Related Work
10. Chapter 3: Problem Formulation
11. Chapter 4: Solution Approach / Methodology
12. Chapter 5: Implementation
13. Chapter 6: Experiment Results / Evaluation
14. Chapter 7: Conclusion and Future Work
15. References
16. Appendix (optional)
Chinese Title Page:
國立陽明交通大學
[Institute Name]
碩士論文
[Chinese Thesis Title]
研究生:[Student Chinese Name]([Student English Name])
指導教授:[Advisor Chinese Name]([Advisor English Name])
中華民國[Year]年[Month]月
English Title Page:
[Institute Name]
National Yang Ming Chiao Tung University
Master Thesis
[English Thesis Title]
Student: [Student English Name]
Advisor: Dr. [Advisor English Name]
[Month] [Year]
Taiwan, Republic of China
The outline uses arrow notation (→) to chain key concepts concisely. The thesis expands each arrow chain into full academic paragraphs with:
| Outline Element | Thesis Expansion |
|---|---|
| Arrow chain (A → B → C) | 1-3 paragraphs explaining the logical flow from A to B to C |
| Section title with parenthetical purpose, e.g. 1.1 (Explain X) | Section with heading, remove parenthetical, write full prose |
| Bullet points in Related Works | Structured paragraphs comparing approaches with citations |
| Notation Table | Formal table with Table X.Y numbering and caption |
| Problem Statement (Input/Output/Objective) | Formal mathematical formulation with equations |
| Figure references ("like figure X show...") | Figure X.Y with formal caption, cross-reference in text |
| Tool/Library tables | Formatted table with Table X.Y numbering |
| [PLACEHOLDER] markers | Mark as [TODO: ...] for later completion |
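As an illustration of the "remove parenthetical" rule in the table, the sketch below splits a section title from its trailing purpose note. The function name `clean_section_title` is hypothetical, and it assumes the purpose always appears as one trailing parenthetical without nested parentheses.

```python
import re
from typing import Optional, Tuple

# Matches "<title> (<purpose>)" where the parenthetical closes the line.
_TRAILING_PURPOSE = re.compile(r"^(.*?)\s*\(([^()]*)\)\s*$")


def clean_section_title(raw: str) -> Tuple[str, Optional[str]]:
    """Split 'Title (purpose)' into (title, purpose); purpose is None if absent."""
    stripped = raw.strip()
    match = _TRAILING_PURPOSE.match(stripped)
    if match is None:
        return stripped, None
    return match.group(1), match.group(2).strip()


if __name__ == "__main__":
    print(clean_section_title("1.1 Background (Explain X)"))
```

The purpose text is returned rather than discarded because it is a useful hint for what the expanded prose of that section should cover.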
Outline (arrow notation):
Honeypot → security tool → simulate real system → decoy hacker → analysis
attack method → prevent hacker attack → past research → simulate-type
honeypot → e.g., Cowrie, Kippo → partial command set → easy expose flaw
Thesis (full academic prose):
Honeypots are security tools designed to simulate real systems, acting as decoys
to attract and monitor malicious activities. By analyzing the behaviors and
techniques employed by attackers, defenders can gain valuable insights into
potential threats and develop more effective countermeasures. Over the years,
researchers have proposed a variety of honeypot implementations. One major
category is simulation-type honeypots, such as Cowrie [3] and Kippo [4], which
attempt to imitate system behavior using limited, predefined command sets.
While lightweight and easier to deploy, these systems are constrained in realism
and can be easily identified by experienced attackers.
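Before expanding a chain like the outline example above, it can help to split it into its individual concepts. The following is a minimal sketch; the name `split_arrow_chain` is an assumption, and it treats line breaks inside a chain as plain spaces, so a concept wrapped across two lines is rejoined.

```python
from typing import List


def split_arrow_chain(chain: str) -> List[str]:
    """Split an arrow-notation chain into its ordered concepts."""
    # Collapse newlines and repeated spaces so wrapped lines rejoin cleanly.
    normalized = " ".join(chain.split())
    return [step.strip() for step in normalized.split("→") if step.strip()]


if __name__ == "__main__":
    outline = """Honeypot → security tool → simulate real system → decoy hacker → analysis
    attack method → prevent hacker attack → past research → simulate-type
    honeypot → e.g., Cowrie, Kippo → partial command set → easy expose flaw"""
    steps = split_arrow_chain(outline)
    # Each adjacent pair is one logical transition the prose has to explain.
    for source, target in zip(steps, steps[1:]):
        print(f"{source}  ->  {target}")
```

Listing the adjacent pairs makes it easier to check that every transition in the chain is covered by a clause or sentence in the expanded paragraph.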
Key patterns observed:
- → in the outline becomes a clause or sentence
- Named examples (e.g., Cowrie, Kippo) become formal citations
Goal: Motivate the research problem, present contributions, and outline the thesis.
IMPORTANT: No section headings in Introduction. Chapter 1 should NOT have any numbered subsections (1.1, 1.2, etc.). Instead, the content should flow as continuous paragraphs under the chapter heading only.
Outline structure (typically 1.1–1.7 in the outline):
- No add_section_heading() calls in Chapter 1
Expansion rules:
Typical Introduction structure in expanded thesis:
[Chapter heading only — no section numbers]
Paragraph 1: Background, motivation, attack lifecycle importance
Paragraph 2: Current detection limitations (alert-based paradigm)
Paragraph 3: Log analysis challenges (volume, complexity)
Paragraph 4: Proposed approach (two-stage: extraction + LLM)
Paragraph 5: Problem statement + contributions intro
- Bullet: Contribution 1
- Bullet: Contribution 2
Paragraph 6: Thesis organization
Goal: Provide technical foundation and position the work in literature.
Outline structure:
Expansion rules:
Table 2.X: Comparison of Related Work
Background subsection template:
[Concept Name]
[Definition paragraph: What is X?]
[Importance paragraph: Why does X matter?]
[Technical details: How does X work? Include equations/figures if needed]
[Connection paragraph: How X relates to our approach]
Goal: Formally define the research problem with mathematical notation.
Expansion rules:
Table 3.1: Summary of Notations
Goal: Present the proposed method in detail.
Expansion rules:
Figure 4.1)
Goal: Describe how the system was built and configured.
Expansion rules:
Table 5.1: Open-Source Tools and Libraries
Goal: Present experimental methodology, results, and analysis.
Expansion rules:
Table 6.X, Figure 6.X)
Goal: Summarize contributions and propose future directions.
Expansion rules:
# Extract text content from the outline document
python -m markitdown paper_outline_v1.docx > outline_content.md
Or using python-docx:
from docx import Document

doc = Document("paper_outline_v1.docx")
# Print each paragraph with its style name so headings can be told apart from body text
for para in doc.paragraphs:
    print(f"[{para.style.name}] {para.text}")
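Once the paragraphs have been read, they can be grouped under their headings so that each outline section is expanded separately. The sketch below works on plain `(style_name, text)` pairs so it can be tried without a .docx file. The name `group_by_heading` is hypothetical, and it assumes headings use style names starting with "Heading" (the python-docx default template does, but a custom outline template may not).

```python
from typing import Dict, Iterable, List, Tuple


def group_by_heading(paragraphs: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group body paragraphs under the most recent heading text."""
    sections: Dict[str, List[str]] = {}
    current = "(front matter)"  # bucket for text that precedes the first heading
    for style_name, text in paragraphs:
        text = text.strip()
        if not text:
            continue  # skip empty paragraphs
        if style_name.startswith("Heading"):
            current = text
            sections.setdefault(current, [])
        else:
            sections.setdefault(current, []).append(text)
    return sections


if __name__ == "__main__":
    # With python-docx this would be:
    #   group_by_heading((p.style.name, p.text) for p in doc.paragraphs)
    sample = [
        ("Heading 1", "1 Introduction"),
        ("Normal", "Honeypot → security tool → simulate real system"),
        ("Heading 1", "2 Background"),
        ("Normal", "LLM → trained on vast text"),
    ]
    for heading, body in group_by_heading(sample).items():
        print(heading, body)
```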
From the outline, extract:
Check user's specification:
For each chapter in the outline:
[X] or \cite{X} references
Create:
Generate a LaTeX project structure:
thesis_v1/
├── main.tex # Master document with \input{} for each chapter
├── chapters/
│ ├── titlepage.tex # Chinese and English title pages
│ ├── abstract_zh.tex # Chinese abstract
│ ├── abstract_en.tex # English abstract
│ ├── chapter1.tex # Introduction
│ ├── chapter2.tex # Background and Related Work
│ ├── chapter3.tex # Problem Formulation
│ ├── chapter4.tex # Solution Approach
│ ├── chapter5.tex # Implementation
│ ├── chapter6.tex # Evaluation
│ └── chapter7.tex # Conclusion
├── figures/ # Figure files
├── tables/ # Standalone table files (optional)
├── references.bib # Bibliography
└── nycu-thesis.cls # NYCU thesis class (if available)
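The directory layout above could be created with a short script like the one below. This is a sketch under assumptions: the function name `scaffold_project` is hypothetical, `titlepage.tex` is included because the main.tex template inputs `chapters/titlepage`, and the optional `nycu-thesis.cls` is not created because its contents are not known here.

```python
from pathlib import Path
from typing import List

# File names follow the project structure listed above.
CHAPTER_FILES = (
    ["titlepage.tex", "abstract_zh.tex", "abstract_en.tex"]
    + [f"chapter{n}.tex" for n in range(1, 8)]
)


def scaffold_project(root) -> List[Path]:
    """Create the thesis directory tree and empty stub files; return chapter paths."""
    root = Path(root)
    for sub in ("chapters", "figures", "tables"):
        (root / sub).mkdir(parents=True, exist_ok=True)
    created = []
    for name in CHAPTER_FILES:
        path = root / "chapters" / name
        if not path.exists():  # never overwrite chapters that were already written
            path.write_text(f"% {name}\n", encoding="utf-8")
        created.append(path)
    for name in ("main.tex", "references.bib"):
        (root / name).touch()
    return created


if __name__ == "__main__":
    for path in scaffold_project("thesis_v1"):
        print(path)
```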
main.tex template:
\documentclass[12pt, a4paper]{report}
% Compile with XeLaTeX: xeCJK requires it, and inputenc is not needed with a UTF-8 engine
\usepackage{amsmath, amssymb, amsthm}
\usepackage{graphicx}
\usepackage{booktabs}
\usepackage{algorithm}
\usepackage{algorithmic}
\usepackage{xeCJK} % For Chinese characters
\usepackage{hyperref} % Loaded last to reduce conflicts with other packages
\title{[Thesis Title]}
\author{[Student Name]}
\date{[Month Year]}
\begin{document}
% Title pages
\input{chapters/titlepage}
% Abstracts
\input{chapters/abstract_zh}
\input{chapters/abstract_en}
% Front matter
\tableofcontents
\listoffigures
\listoftables
% Main chapters
\input{chapters/chapter1}
\input{chapters/chapter2}
\input{chapters/chapter3}
\input{chapters/chapter4}
\input{chapters/chapter5}
\input{chapters/chapter6}
\input{chapters/chapter7}
% References
\bibliographystyle{IEEEtran}
\bibliography{references}
\end{document}
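Because the template uses `\bibliography{references}` together with CJK text, a typical build runs XeLaTeX, then BibTeX, then XeLaTeX twice more to resolve citations and cross-references. The helper below only assembles that command sequence; the name `latex_build_commands` is an assumption, and actually running the commands requires a TeX distribution on the machine.

```python
from typing import List


def latex_build_commands(main: str = "main") -> List[List[str]]:
    """Return the usual xelatex/bibtex command sequence for a thesis build."""
    xelatex = ["xelatex", "-interaction=nonstopmode", f"{main}.tex"]
    # Pass 1 writes the .aux file, bibtex builds the bibliography,
    # passes 2 and 3 resolve citations and cross-references.
    return [list(xelatex), ["bibtex", main], list(xelatex), list(xelatex)]


if __name__ == "__main__":
    # To execute the build from inside the thesis directory:
    #   import subprocess
    #   for cmd in latex_build_commands():
    #       subprocess.run(cmd, check=True)
    for cmd in latex_build_commands():
        print(" ".join(cmd))
```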
Use python-docx to generate the thesis document:
from docx import Document
from docx.shared import Pt, Inches, RGBColor
from docx.enum.text import WD_ALIGN_PARAGRAPH
doc = Document()
# Set default font
style = doc.styles['Normal']
font = style.font
font.name = 'Times New Roman'
font.size = Pt(12)
# Generate title pages, abstracts, TOC placeholder, chapters...
# (See detailed generation code below)
doc.save("thesis_v1.docx")
DOCX Generation Rules:
After generating:
[TODO] or [PLACEHOLDER] markers
from docx import Document
from docx.shared import Pt, Inches, RGBColor, Cm
from docx.enum.text import WD_ALIGN_PARAGRAPH
from docx.enum.section import WD_ORIENT
from docx.oxml.ns import qn


def create_thesis_doc():
    doc = Document()
    # Set default style
    style = doc.styles['Normal']
    font = style.font
    font.name = 'Times New Roman'
    font.size = Pt(12)
    paragraph_format = style.paragraph_format
    paragraph_format.line_spacing = 1.5
    # Set margins
    for section in doc.sections:
        section.top_margin = Cm(2.54)
        section.bottom_margin = Cm(2.54)
        section.left_margin = Cm(3.17)
        section.right_margin = Cm(2.54)
    return doc


def add_chapter_heading(doc, chapter_num, title):
    """Add a chapter heading with page break."""
    doc.add_page_break()
    heading = doc.add_heading(f'Chapter {chapter_num}', level=0)
    for run in heading.runs:
        run.font.size = Pt(24)
        run.font.color.rgb = RGBColor(0, 0, 0)
    subtitle = doc.add_heading(title, level=1)
    for run in subtitle.runs:
        run.font.size = Pt(18)
        run.font.color.rgb = RGBColor(0, 0, 0)


def add_section(doc, section_num, title, content_paragraphs):
    """Add a section with multiple paragraphs."""
    heading = doc.add_heading(f'{section_num} {title}', level=2)
    for run in heading.runs:
        run.font.color.rgb = RGBColor(0, 0, 0)
    for para_text in content_paragraphs:
        p = doc.add_paragraph(para_text)
        p.paragraph_format.first_line_indent = Inches(0.5)
        p.paragraph_format.space_after = Pt(6)


def add_figure(doc, image_path, caption, fig_num):
    """Add a centered figure with caption."""
    p = doc.add_paragraph()
    p.alignment = WD_ALIGN_PARAGRAPH.CENTER
    run = p.add_run()
    run.add_picture(image_path, width=Inches(5))
    cap = doc.add_paragraph(f'Figure {fig_num}: {caption}')
    cap.alignment = WD_ALIGN_PARAGRAPH.CENTER
    cap.runs[0].font.size = Pt(10)


def add_table_with_caption(doc, headers, rows, caption, table_num):
    """Add a table with caption above."""
    cap = doc.add_paragraph(f'Table {table_num}: {caption}')
    cap.alignment = WD_ALIGN_PARAGRAPH.CENTER
    cap.runs[0].font.size = Pt(10)
    cap.paragraph_format.space_after = Pt(4)
    table = doc.add_table(rows=1 + len(rows), cols=len(headers))
    table.style = 'Table Grid'
    # Header row
    for i, h in enumerate(headers):
        cell = table.rows[0].cells[i]
        cell.text = h
        for p in cell.paragraphs:
            for r in p.runs:
                r.font.bold = True
                r.font.size = Pt(10)
    # Data rows
    for row_idx, row_data in enumerate(rows):
        for col_idx, cell_text in enumerate(row_data):
            cell = table.rows[row_idx + 1].cells[col_idx]
            cell.text = str(cell_text)
            for p in cell.paragraphs:
                for r in p.runs:
                    r.font.size = Pt(10)
When generating LaTeX, follow these conventions:
% Chapter heading
\chapter{Introduction}
% Section with prose
\section{Background and Motivation}
Honeypots are security tools designed to simulate real systems, acting as decoys
to attract and monitor malicious activities~\cite{spitzner2003honeypots}.
By analyzing the behaviors and techniques employed by attackers, defenders can
gain valuable insights into potential threats and develop more effective
countermeasures.
% Figure
\begin{figure}[htbp]
\centering
\includegraphics[width=0.8\textwidth]{figures/architecture.pdf}
\caption{System architecture overview showing the two-stage pipeline.}
\label{fig:architecture}
\end{figure}
% Table
\begin{table}[htbp]
\centering
\caption{Summary of notations used in this thesis.}
\label{tab:notations}
\begin{tabular}{lll}
\toprule
\textbf{Category} & \textbf{Notation} & \textbf{Description} \\
\midrule
Alerts \& Logs & $A = \{a_1, a_2, \ldots\}$ & Alert set from EDR/NDR \\
\bottomrule
\end{tabular}
\end{table}
% Equation
\begin{equation}
\text{Recall Rate (RR)} = \frac{|\hat{L} \cap L^*|}{|L^*|}
\label{eq:recall}
\end{equation}
% Problem statement
\textbf{Problem 1 (Malicious Log Extraction).}
\textit{Given} a set of raw logs $L$ and a set of alerts $A$,
\textit{find} the subset of malicious logs $\hat{L} \subseteq L$ such that
the recall rate is maximized while minimizing the false positive rate.
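When tables are produced programmatically rather than typed by hand, a small generator can emit the same booktabs layout as the table convention above. This is a sketch, not the skill's own generator: the names `escape_cell` and `booktabs_table` are hypothetical, only a few common special characters are escaped, and cells wrapped in `$...$` are assumed to be math and passed through unchanged.

```python
from typing import Sequence

# Characters that need a backslash in LaTeX text mode (not an exhaustive list).
_LATEX_SPECIALS = {"&": r"\&", "%": r"\%", "#": r"\#", "_": r"\_"}


def escape_cell(text: str) -> str:
    """Escape text-mode specials; leave cells wrapped in $...$ untouched."""
    if len(text) >= 2 and text.startswith("$") and text.endswith("$"):
        return text
    return "".join(_LATEX_SPECIALS.get(ch, ch) for ch in text)


def booktabs_table(headers: Sequence[str], rows: Sequence[Sequence[str]],
                   caption: str, label: str) -> str:
    """Render a left-aligned booktabs table with the caption above the body."""
    lines = [
        r"\begin{table}[htbp]",
        r"\centering",
        rf"\caption{{{escape_cell(caption)}}}",
        rf"\label{{{label}}}",
        rf"\begin{{tabular}}{{{'l' * len(headers)}}}",
        r"\toprule",
        " & ".join(rf"\textbf{{{escape_cell(h)}}}" for h in headers) + r" \\",
        r"\midrule",
    ]
    for row in rows:
        lines.append(" & ".join(escape_cell(str(c)) for c in row) + r" \\")
    lines += [r"\bottomrule", r"\end{tabular}", r"\end{table}"]
    return "\n".join(lines)


if __name__ == "__main__":
    print(booktabs_table(
        ["Category", "Notation", "Description"],
        [["Alerts & Logs", r"$A = \{a_1, a_2, \ldots\}$", "Alert set from EDR/NDR"]],
        "Summary of notations used in this thesis.",
        "tab:notations",
    ))
```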
| ❌ Informal (Outline) | ✅ Formal (Thesis) |
|---|---|
| "LLM → trained on vast text" | "Large Language Models are trained on extensive text corpora" |
| "can't decoy hacker dynamic" | "cannot dynamically adapt deception strategies to evolving attacker behaviors" |
| "garbage in, garbage out" | "the quality of downstream analysis is inherently bounded by the quality of its input" |
| "over half techniques missed" | "more than 50% of attack techniques remain undetected" |
| "need automated solution" | "an automated approach is required to address the scalability challenge" |
Every paragraph should follow this pattern:
⚠️ CRITICAL: Never hallucinate citations.
- Citations given in the outline, such as [1], [6], [7] → preserve exactly as-is
- Missing citations: mark as [CITATION NEEDED]
- LaTeX output: \cite{key} with entries in references.bib
- DOCX output: [X] numbering matching the reference list
Figures from the outline (embedded as base64 images):
[TODO: Replace with high-resolution figure]
Tables from the outline:
booktabs style in LaTeX (no vertical lines)
Before delivering the thesis:
- [ ] All outline sections expanded to full paragraphs
- [ ] Consistent terminology throughout
- [ ] All figures numbered and captioned
- [ ] All tables numbered and captioned
- [ ] All equations numbered
- [ ] Cross-references correct (Figure X.Y, Table X.Y, Equation X)
- [ ] Citations preserved from outline
- [ ] Missing citations marked as [CITATION NEEDED]
- [ ] No informal language remaining
- [ ] Transitions between sections are smooth
- [ ] Front matter complete (title pages, abstracts, TOC)
- [ ] [TODO] markers for incomplete sections clearly marked
- [ ] Output format matches user's request (LaTeX or DOCX)
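Two of the checklist items (leftover markers and citations that do not come from the outline) lend themselves to an automated pass over the generated text. The sketch below is one possible check; the name `audit_text` is hypothetical, and it only recognises single numeric citations such as `[3]`, not grouped forms like `[3, 4]` or `\cite{key}`.

```python
import re
from typing import Iterable, List, Tuple

# Markers the skill uses for unfinished content.
MARKER_RE = re.compile(r"\[(?:TODO|PLACEHOLDER|CITATION NEEDED)[^\]]*\]")
# Single numeric citations such as [3].
CITATION_RE = re.compile(r"\[(\d+)\]")


def audit_text(text: str, allowed_citations: Iterable[int]) -> Tuple[List[str], List[int]]:
    """Return (leftover markers, citation numbers that are not in the outline)."""
    markers = MARKER_RE.findall(text)
    cited = {int(number) for number in CITATION_RE.findall(text)}
    unknown = sorted(cited - set(allowed_citations))
    return markers, unknown


if __name__ == "__main__":
    draft = ("Simulation-type honeypots such as Cowrie [3] and Kippo [4] are common [9]. "
             "[TODO: add architecture figure] [CITATION NEEDED]")
    markers, unknown = audit_text(draft, allowed_citations={3, 4})
    print("Markers still present:", markers)
    print("Citations not found in the outline:", unknown)
```

Markers are reported rather than removed, since [TODO] and [CITATION NEEDED] entries are meant to stay visible until the author resolves them.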
This skill relies on the following:
| Dependency | Purpose |
|---|---|
| markitdown | Extract text from .docx outline |
| python-docx | Generate .docx thesis output AND read .docx outline |
| pypdf / pdfplumber | Read reference PDF documents |
| docx skill | Patterns for .docx creation (if using docx-js) |
| pdf skill | Patterns for reading PDF references |
- markitdown: Extract text from .docx
- python-docx: Read/write .docx files
- pypdf / pdfplumber: PDF text extraction
- xelatex recommended (for CJK character support)
User: "Convert paper_outline_v1.docx into a complete thesis"
Action:
- Read paper_outline_v1.docx using markitdown
- Generate the thesis_v1/ directory
User: "Convert paper_outline_v1.docx into thesis_v1.docx"
Action:
- Read paper_outline_v1.docx using markitdown or python-docx
- Generate thesis_v1.docx using python-docx
User: "Expand Chapter 4 of the outline into full thesis prose"
Action: