So now that I have my Python program searching the way I want, I need to be able to use it on the web. That reminded me of a silly little Python 2 program I wrote a long time ago and had on my site at one time. The program is called bigNum.py and can be accessed here. It may be silly, but it's also interesting: it tells you how to pronounce very large numbers. For my search program it's a good reference because it's simple and doesn't use any complicated web framework. It only uses cgi (hopefully still available in Python 3) to get the input from the form, and I just write out the HTML by hand. No PHP needed!
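For what it's worth, the cgi module still exists in older Python 3 releases but is deprecated and was removed in Python 3.13, so a form handler along these lines could parse QUERY_STRING directly instead (the field name and HTML here are my invention, not bigNum.py's actual code):

```python
#!/usr/bin/env python3
# CGI-style form handling without the deprecated cgi module:
# parse the GET query string that the web server passes in.
import os
from urllib.parse import parse_qs

def handle(query_string):
    """Return a complete CGI response echoing the 'number' form field."""
    params = parse_qs(query_string)
    number = params.get("number", [""])[0]
    return (
        "Content-Type: text/html\n\n"
        "<html><body>\n"
        f"<p>You entered: {number}</p>\n"
        "</body></html>"
    )

if __name__ == "__main__":
    # The web server sets QUERY_STRING for a GET form submission.
    print(handle(os.environ.get("QUERY_STRING", "")))
```

The rest of the page is still just manually written HTML, the same approach as the old program.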
I was looking for a way to add search to a static site. I considered running grep offline on the HTML pages and then linking to the matching web pages, but grep gave me problems. I have to assume that with something as widely used as grep, the problems were due to my misuse. Be that as it may, if I can't get it to do what I want, then for this purpose it's useless.
Success!

```
logfile/posts$ grep -il "vm/" *.html
339.html
345.html
457.html
```

And with the matching lines shown:

```
logfile/posts$ grep -i "vm/" *.html
339.html:<h1>VM/370 emulation</h1>
339.html: Ran VM/370 in a IBM S/370, ESA/390, and z/390 hardware emulator on Linux.
345.html: Today I'm more comfortable using MVS in Hercules emulation than VSE and VM. Mostly because MVS didn't take as much work as VM/DOS to actually do something useful. It was more useful as they say...out of the box.
457.html:<h1>IBMs VM/370</h1>
457.html: Continuing retro. Now using the hercules emulator for IBM Mainframes and also the 3270 terminal emulator. Here's VM/370, IBM's first version, released in 1972...way before Linux knew what a VM was or, for that matter, the world knew what Linux was. This is very much like the VM systems I was employed to maintain...
$
```

Right now I have all posts in one directory, but I'm considering breaking them up into subdirectories by year, so I need to be able to search the directories recursively.

So much for recursive search working. This from the help...

```
-r, --recursive  like --directories=recurse
```

```
logfile$ ls -la
total 116
drwxrwxr-x  3 bill bill  4096 Aug 28 14:45 .
drwxr-xr-x 35 bill bill  4096 Aug 25 18:52 ..
-rw-rw-r--  1 bill bill  2846 Aug 22 18:23 about.html
-rw-r--r--  1 bill bill  3832 Aug 28 14:45 grep.txt
-rw-r--r--  1 bill bill 59270 Aug 25 21:59 index.html
drwxrwxr-x  2 bill bill 36864 Aug 25 21:59 posts
```

One subdirectory up: failure!

```
logfile$ grep -r "vm/" *.html
logfile$
```

Nothing returned! I read something saying grep has a problem with wildcards, so...

Failure!

```
logfile$ grep -r "vm/" .
logfile$
```

Again nothing returned! I read that grep has something to do with regular expressions (the name comes from "global regular expression print"). So why did "vm/" work in the first examples? Finally, on my own research, I suspected the slash. As you can see below, escaping the forward slash worked. However, I didn't escape it above...and it worked! That's inconsistent! (Looking at it again, the command that worked also has the -i flag, while the failing commands search case-sensitively for "vm/" against files that contain "VM/"; that, rather than the slash, may be the real difference.)

```
logfile$ grep -ril "vm\/" .
./index.html
./posts/339.html
./posts/345.html
./posts/457.html
logfile$
```

Finally, I read you can use the -F switch to do a plain, non-regex search. But that didn't work either! Supposedly you can avoid regexes by using -F. However...

Failure! Failure! Failure! Failure!

```
logfile$ grep -r "vm/" .
logfile$ grep -Fr "vm/" .
logfile$ grep -r --fixed-strings "vm/" .
logfile$ grep -F -r "vm/" .
```

Finally I just wrote a ~40-line Python program that does what I want. As you can see, it recursively searched two directories. Now was that so hard?

```
(base) bill@bill-MS-7B79:~/Mystuff/Python3$ ./pyFind.py
Enter string to search for: vm/
Searching for: vm/
Found in: logfile/posts/342.html
Found in: logfile/posts/453.html
Found in: logfile/posts/336.html
Found in: logfile/index.html
Files searched 458
(base) bill@bill-MS-7B79:~/Mystuff/Python3$
```
Fixed many broken image links. These almost 500 simple HTML posts total only ~500 KB, so for now I can do a simple grep: grep -il "mainframe" *.html. grep can find a string in a split second.
Changed the filenames to ensure they're valid. So this is what it looks like today. There are still some link issues, but not bad for a few days' work.
Well, my Python SSG worked really well locally. However, as I previously mentioned, FTP wouldn't copy over some filenames with special characters (slashes, for example), so I must rethink the file names.
I completed the main parts of generating a Computer Log static site. Almost all the blog posts are created and many of the images display. The index shows all 456 of my posts. There are some image-linking problems, which come from using WordPress's upload feature; the images linked from my own image directory work fine. I also have a problem with forward slashes (which designate a directory in Linux) in file names, for example RSTS/E, so I changed those to backslashes in the file names. That worked on my Linux file system, but FTP failed to upload them to my Linux server, so I may have to rethink the file names. It took ~38 seconds to create the static site. All the Python code ran fast, but calling an external program to convert markdown to HTML put the brakes on the speed.
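Since a slash can't survive the filesystem and a backslash didn't survive FTP, one option is to map the troublesome characters to dashes before writing the file. The exact substitution scheme below is my assumption, not what the site ended up using:

```python
import re

def safe_filename(title):
    """Replace characters that break filesystems or FTP transfers.
    '/' can't appear in a Linux filename at all; the others survive
    locally but can trip up FTP servers or URLs."""
    cleaned = re.sub(r'[/\\:*?"<>|]', "-", title)
    # Collapse runs of whitespace and dashes into a single dash.
    return re.sub(r"[-\s]+", "-", cleaned).strip("-")
```

For example, a post titled "RSTS/E notes" would become the file name "RSTS-E-notes", which uploads without complaint.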
Started work on my own WordPress-to-markdown-to-static-site pipeline using Python. I got as far as creating modified markdown that includes the date and title, with the output filename changed to "date/time title". External comments are bypassed. I plan on using a Python module (md-to-html) to convert the markdown to HTML.
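The date-and-title step might be sketched like this (the front-matter keys, the dash-separated stamp, and the helper itself are my assumptions; the literal slash in "date/time" can't appear in a real filename):

```python
from datetime import datetime

def post_to_markdown(title, date, body):
    """Return (filename, markdown text) for one exported post."""
    # Dashes stand in for the slash in "date/time"; a slash can't
    # actually appear in a Linux filename.
    stamp = date.strftime("%Y-%m-%d-%H%M")
    filename = f"{stamp} {title}.md"
    # Minimal front matter carrying the date and title.
    text = f"---\ntitle: {title}\ndate: {date:%Y-%m-%d %H:%M}\n---\n\n{body}\n"
    return filename, text
```

The resulting files sort chronologically by name, and the front matter keeps the date and title available for the HTML step.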
Talked about it here. This whole topic started because of a WordPress search failure. I don't think I want to devote the time right now, but possibly in the future. Like I said, my judoplaces static website was created in 2013 using Python/SQLite, long before SSGs became a thing; before that it was created with MS-Access. Thinking more about it, especially after recently reviewing the steps to create a site from WordPress, I think I could consolidate the last five steps into one step with more control, including breaking up one long index page into multiple pages. Then again, I could just do all those steps initially, since they already work, and then going forward update the site using markdown, perhaps storing the markdown in SQLite. Having the log in SQLite would let me easily retrieve the entries in date order or search on any field, rather than making Python do it. Just brainstorming right now!
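The SQLite idea could be as small as this; the table layout and column names are just my guess at what such a log might use:

```python
import sqlite3

# Keep log entries in SQLite so date ordering and field searches
# become one-line SQL queries instead of Python loops.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, date TEXT, title TEXT, md TEXT)"
)
con.executemany(
    "INSERT INTO posts (date, title, md) VALUES (?, ?, ?)",
    [
        ("2013-05-01", "judoplaces launch", "First static version..."),
        ("2021-08-25", "WordPress export", "Search stopped working..."),
    ],
)
# Newest first; any field is searchable with WHERE ... LIKE.
rows = con.execute("SELECT title FROM posts ORDER BY date DESC").fetchall()
```

ISO-style date strings sort correctly as text, so ORDER BY does the work the index page needs.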
The WordPress-to-static-site conversion with the Eleventy SSG went pretty well, even considering my admitted lack of interest in web design. I made some very basic changes and improved the look considerably. The only, or at least the biggest, thing missing is search. But there are many possibilities right off the top of my head. Perhaps the simplest is running grep on the markdown files, which will show the title. Or perhaps a simple Python program that goes through the files and presents an HTML select screen.
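The select-screen idea might be sketched like this (the function name and markup are assumptions, not an existing program):

```python
import html

def results_select(matches):
    """Render (url, title) search matches as an HTML <select> element."""
    options = "\n".join(
        f'<option value="{html.escape(url)}">{html.escape(title)}</option>'
        for url, title in matches
    )
    return f'<select name="results">\n{options}\n</select>'
```

A little JavaScript (or a form submit) could then jump to the selected post's URL.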
After solving (so far) the FTP problem, I finally moved a small test sample of my static log file, created by 11ty, to my server, and it didn't work! The problem was that it wasn't using relative addressing. Or am I missing something? It seems, for a static site generator, that I should be able to simply move the files to my server and have them work. I do seem to remember reading someone's tutorial about having to fix link addresses. Surely I'm missing something...maybe a switch? Unless 11ty is only designed to work with Netlify? All the other tutorials I read seemed to use that method.
Anyhoo, I wrote a Python program to fix the hrefs in the main index.html. In hindsight I could have simply used a text editor to change 'href="/posts/...' to 'href="posts/...', but at the time I thought I would have the program do more. After which...it worked. It isn't pretty (that can be worked on), but it worked! Web design isn't my thing, but I did add some borders and white space. And here it is, maybe with a few small fixes needed. No PHP, no SQL, just static-site goodness. I think I said it before, but I'll say it again: I find it very cool that 11ty reads the date/time in the front matter and sorts the index correctly (newest entries at the top). Some entries, such as Star Trek, also include images! Today I'm approaching 500 WordPress posts and wouldn't want to recopy everything every time I make a change or add a new entry, so I can use Filezilla's copy options to not rewrite existing entries.
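The fix itself boils down to one substitution; a sketch of what the program does to index.html (the real one was written to allow for more later):

```python
def fix_hrefs(html_text):
    """Make absolute /posts/ links relative so the index works
    from the blog subdirectory on the server."""
    return html_text.replace('href="/posts/', 'href="posts/')
```

Read index.html, run it through this, and write it back out.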
So, to recap, the steps to create a static site from WordPress:
- Use WordPress export to create xml and save locally.
- Run the blog2md-master utility to convert local xml to markdown.
- Run my Python 3 program fixMD to change Hugo front matter to 11ty front matter.
- Move the markdown files to the 11ty posts subdirectory.
- Run “npx eleventy” to create static site.
- Fix article links in index.html to use relative addressing.
- Move the files in '_site' to my website's blog subdirectory.