Minimizing LaTeX installations HOWTO

Many of us would like to use LaTeX inside some kind of application to automatically generate PDFs or other pretty, printable files (in my case it was invoices).

The problem with embedding LaTeX into something is obvious: it’s HUGE. Including it either brings you a difficult-to-install system dependency, or makes your software package unnecessarily big, or just hogs disk space. The topic has come up several times, for example here: [groups.google.com]

My solution to this problem was, much like the one in the Google Groups thread, a brutal file-stripping session. Done by hand this can be a lengthy process, so I decided to use the super-usable strace utility, which cuts the work down to a few tens of minutes.

Identifying the needed files

First, notice the size of the original LaTeX installation on my box:

exa@his-box $ qsize -s "texlive-"
app-text/texlive-core-2010: 456 files, 88 non-files, 8541.190 KB
dev-texlive/texlive-documentation-base-2010: 50 files, 18 non-files, 3076.89 KB
dev-texlive/texlive-basic-2010: 1473 files, 141 non-files, 17973.510 KB
dev-texlive/texlive-fontutils-2010: 152 files, 49 non-files, 1406.198 KB
dev-texlive/texlive-latex-2010: 1033 files, 80 non-files, 10252.244 KB
dev-texlive/texlive-fontsextra-2010: 16513 files, 731 non-files, 339282.500 KB
dev-texlive/texlive-fontsrecommended-2010: 5402 files, 272 non-files, 148534.897 KB
dev-texlive/texlive-latexrecommended-2010: 1644 files, 86 non-files, 13770.348 KB
dev-texlive/texlive-pictures-2010-r1: 475 files, 118 non-files, 6660.651 KB
dev-texlive/texlive-music-2010: 375 files, 53 non-files, 5905.935 KB
dev-texlive/texlive-metapost-2010: 174 files, 61 non-files, 1892.797 KB
dev-texlive/texlive-plainextra-2010: 106 files, 31 non-files, 1092.994 KB
dev-texlive/texlive-latexextra-2010-r1: 2366 files, 856 non-files, 22625.348 KB
Totals: 30219 files, 2584 non-files, 567.397 MB

That gives us a whole lot of stuff to optimize! Let’s use the strace tool to see which files are actually needed. Run it on pdflatex processing a file that uses every feature you will need in your minimized LaTeX distribution, otherwise you could end up with something missing.

strace -f pdflatex file.tex 2> strace.log

Now extract the interesting information:

grep ^open strace.log > files.txt

Now we need to strip the ugly open("...") wrapper from the file list (on newer systems the call shows up as openat(AT_FDCWD, "...") instead). I personally prefer being lazy and using vim with regexes for that, but you can also use sed with the following two regexes (in order):

s/^[^"]*"//
s/".*//

After that, you should edit the modified files.txt so that every file is listed only once (some could have been opened multiple times). The classic | sort | uniq is pretty good for that. You may also delete references to dynamic libraries, which you certainly wouldn’t distribute, especially the system-ish ones like libm.so.
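If you would rather not do the clean-up by hand, the steps above can be sketched as a single shell function. Treat it as a sketch under a few assumptions: the name extract_paths is made up for illustration, paths are assumed to contain no double quotes or newlines, and failed opens (strace marks them with "= -1") are dropped because there is nothing to copy for them.

```shell
# Sketch: turn an strace log into a sorted, de-duplicated list of opened files.
# extract_paths is a hypothetical helper name, not part of any tool.
extract_paths() {
    # $1: strace log file; prints one path per line
    grep -E '(^| )open(at)?\(' "$1" |      # open()/openat() calls, with or without a [pid N] prefix
        grep -v ' = -1 ' |                 # skip calls that failed (e.g. ENOENT)
        sed -E 's/^[^"]*"//; s/".*//' |    # keep only the text between the first pair of quotes
        grep -v -E '\.so(\.[0-9]+)*$' |    # drop shared libraries such as libm.so.6
        sort -u                            # list every file once
}
```

Calling extract_paths strace.log > files.txt would then produce the cleaned list in one go.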

Finally, to see the size of the result, take your cleaned files.txt, and run something like this:

ls -dLs `cat files.txt` | awk '{ sum += $1 } END { print sum }'

which should present you with the size, in kilobytes, of the minimized LaTeX package you are about to produce. In my case (the invoice) it was a striking 2865 KiB, which is WAY way way better than the original half a gig.
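If files.txt is long, or contains paths with spaces, stuffing it into a single ls command line gets fragile. A possible alternative, again only a sketch, reads the list line by line. The name total_bytes is hypothetical, and note that it counts file contents in bytes rather than disk blocks, so the figure will differ slightly from what ls -s reports.

```shell
# Sketch: sum the size in bytes of every regular file named in a list.
# total_bytes is a hypothetical helper name.
total_bytes() {
    # $1: file with one path per line
    while IFS= read -r f; do
        # -f follows symlinks, so a link to a regular file is counted too;
        # missing files and directories are silently skipped
        [ -f "$f" ] && wc -c < "$f"
    done < "$1" | awk '{ sum += $1 } END { print sum + 0 }'
}
```

For example, total_bytes files.txt prints a single number; divide by 1024 for KiB.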

Minimizing the installation

Once files.txt is complete, you can just back up the listed files, uninstall the system texlive distribution, and unpack the needed files back in place.

Please note that you might have to update the ls-R files that the TeX installation generates in its folders. These are the filename databases that kpathsea uses to speed up file lookups; mktexlsr (also available as texhash) regenerates them.

If you end up with errors, the most probable cause is that you’ve missed something in your files.txt, or created it using a TeX file that doesn’t really cover the complete feature set you need. I suppose you made a backup before deleting texlive, so there’s no problem with retrying! :D
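The backup step could look roughly like the sketch below. It assumes GNU tar, and the function name pack_minimal and the archive name are made up for illustration. The -h flag stores the files that symlinks point to rather than the links themselves, and --no-recursion keeps a directory that happens to be in the list from dragging its whole contents into the archive.

```shell
# Sketch: pack every path listed in a file into a compressed archive.
# pack_minimal is a hypothetical helper name; GNU tar is assumed.
pack_minimal() {
    # $1: list of paths, one per line; $2: archive to create
    # tar strips the leading "/" from absolute paths and says so on stderr
    tar --no-recursion -czhf "$2" -T "$1"
}
```

Something like pack_minimal files.txt texlive-min.tar.gz creates the archive, and after the uninstall tar -xzf texlive-min.tar.gz -C / should put the files back where they were.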

That’s all, hope it helped. Happy TeXing!