FPM Journal
Thoughts: Browsing History
Readership Tracking
Linking Using Package Dependency
So links never break (and original author can keep updating document URL at whim).
Git Tracking
More than one fpm package per git repo
Article Out
(on how having article.page allows us to get just the content of a page, without header/sidebar/footer etc).
Special Packages

21st Feb 2022

We recently spoke about WASM packages. We have FPM packages that will be used to distribute static assets. We want FPM packages for font support. We also need FPM packages to distribute syntax definitions and syntax themes. We can maybe create FPM packages to distribute SQLite or CSV data files.

In many cases the advantage of creating an FPM package exists even when the end users of the thing being distributed (a font, a SQLite file, a syntax definition) are not going to use FPM itself.

Say you have created a font; you would want to distribute it both to a general audience and to people who want to use your font in an FPM package. In such a case it would make sense to create an FPM package, since the FPM package can become the website for your font, and tomorrow, when we have FPM repo in place, you can easily monetise or otherwise control the distribution of your font.

Package Interfaces

Each kind of special package will have a package interface, e.g. for fonts we will have fpm.dev/font, which designates that a package is a font package and is being used as a font package. Would both be required? Not sure; explicit is better, so there is that.

The FPM CLI will verify that the package is a proper font package if you say it implements fpm.dev/font. What that means is up to the interface: say the font file must have a particular file name, or must come with fallback fonts, or must carry the metadata needed to generate a CSS @font-face statement, etc.

Based on the interface fpm will do something custom: a font will be used in the generated HTML, syntax definitions and syntax themes would be passed to syntect during syntax highlighting, and so on.
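As a sketch of what such interface verification might look like: the specific rules below (required font file extensions, a font.ftd metadata file) are hypothetical examples, since the text above says the rules are up to the interface.

```python
# Hypothetical checks for a package claiming to implement fpm.dev/font.
# The actual rules are not decided; these are illustrative only.
def verify_font_package(files: set[str]) -> list[str]:
    """Return a list of problems; an empty list means the package passes."""
    problems = []
    if not any(f.endswith((".woff2", ".ttf")) for f in files):
        problems.append("no font file (.woff2/.ttf) found")
    if "font.ftd" not in files:
        problems.append("missing font.ftd metadata (used to generate @font-face)")
    return problems

# A package shipping a font file plus metadata passes:
assert verify_font_package({"FPM.ftd", "font.ftd", "inter.woff2"}) == []
```

The CLI would run such checks for every interface a dependency claims to implement, and refuse the dependency if any problem is reported.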

Static FPM Repo

20th Feb 2022

Once the translation feature is polished, special packages are implemented, and processors are done, we have to start thinking about FPM Repo. It is planned but no work has started on it. FPM Repo is quite ambitious; we want to do a lot, but how do we begin?

One option is we become a target for static hosting. FPM is going to have dynamic features, but we have not yet built them, so we can at least do static hosting. Once static fpm repo is done we can look at dynamic stuff (fpm processors running at page request time instead of publish time) and CR/online editing stuff. Not sure in which order; static -> dynamic -> CR seems logical, because dynamic has no frontend UI and is purely backend work, and maybe we will build CR/editor etc. using dynamic FPM.

Read Permissions

If we have static hosting, we can implement read permissions. When using GitHub Pages/Vercel, the permission system is pretty limited: GitHub Pages only allows restricting a private site to people having access to some repo, which is great for an engineering team but cannot be used for your entire company, for example. Vercel has a better story, with support for multiple auth providers like Okta, AWS Cognito and Firebase, but there is no "platform support" for authorisation rules, meaning if you are using Hugo and hosting on Vercel, it's not easy to just say "allow access to people from my company"; you will have to do some coding.

With Static FPM Repo we can support authorisation rules in FPM.ftd itself. If we compare with Hugo on Vercel, it would be akin to Hugo supporting authorisation.

Hugo and Vercel are independent products, and any integration between the two requires effort. FPM and FPM-Repo are being developed in parallel and can have complementary features.

Publishers Advantage

Imagine if one could say this fpm package is only readable by people in my org, meaning they have a @foo.com email address. Or by people who follow a certain account on Twitter, have starred a certain repository on GitHub, or are part of some Discord or Telegram community, and so on.

The first version of fpm-repo is being designed to realise this publisher advantage. We can start with something simple like Google Sign-In for company email verification, and then add other plugins for Slack, Telegram etc. later.
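A minimal sketch of what an authorisation rule check could look like, assuming a hypothetical `email-domain:` rule format (the rule syntax is not specified anywhere above; only the @foo.com idea is):

```python
# Sketch of a read-permission check. "email-domain:foo.com" is a made-up
# rule format standing in for whatever FPM.ftd would actually declare.
def can_read(rule: str, verified_email: str) -> bool:
    if rule == "public":
        return True
    if rule.startswith("email-domain:"):
        domain = rule.split(":", 1)[1]
        return verified_email.endswith("@" + domain)
    # Unknown rules deny by default.
    return False

assert can_read("email-domain:foo.com", "amitu@foo.com")
assert not can_read("email-domain:foo.com", "someone@bar.com")
```

Twitter/GitHub/Discord rules would be further rule kinds handled by plugins, each verifying a claim against the respective provider.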

Processors And Host Method

19th Feb 2022

We currently ship 3-4 processors as part of FPM. These are part of FPM source code, and are implemented in Rust. We also ship a few “host methods”, which FTD can call using message-host event handlers. These are implemented in JavaScript.

Having only a few such processors and methods is obviously not great because people are going to have arbitrary needs, and we can not hope to anticipate all of them and ship all such methods. We have to make it such that people can write their own.

WASM: Processor And Host Method Packages

Ideally we should be able to distribute them as FPM packages themselves, so no one has to install cargo or npm etc. This is possible if we somehow use wasm, and ship wasm files as part of FPM packages.

We have two alternatives, wasmer and wasmtime.

Processor Use Cases
Currently processors are written for parsing the ToC, which is pure computation. Then we have a processor to read a file (the include processor), which needs disk access. We have a processor to make HTTP requests and another to run SQLite queries against SQLite databases stored in the FPM package.
Processors And Reproducibility

If an FPM package is pure, or depends only on current package content, it is reproducible, meaning if I add that package as a dependency to some package, it will work as expected. But a processor may access a resource only one machine has: say it makes an HTTP request to some intranet API, or to a database which only allows connections from a given IP and with credentials you would of course not want to share. If such a package is included as a dependency, it may not work, or it would require extra setup before it can work.

Not much we can do about it. We do not want to curtail the power of processors.

Processor Sandboxing

Are there any limits to what processors can do? We will have to allow processors to read files on disk, maybe even edit them. The good thing is we have some level of control: WASM cannot automatically execute arbitrary code, it can only do computation and interact with the system via methods we explicitly provide.

Should we limit them to files in the package or let them read from anywhere? It becomes tricky as some use cases can exist which will need extra access. How about network access?

I think we can emulate a lot from Deno's permission system. This is good prior work, so we should try to start from there.

We can start with a very limited sandbox, only giving readonly access to files in current package.
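The core of such a sandbox is a path confinement check: resolve whatever path the processor asks for and refuse anything that escapes the package root. A sketch (function name is illustrative):

```python
import os

# Sketch of the read-only sandbox check: a processor may only read files
# that resolve to a path inside the current package root, so "../" tricks
# and symlink-style escapes are rejected.
def sandboxed_read_allowed(package_root: str, requested: str) -> bool:
    root = os.path.realpath(package_root)
    target = os.path.realpath(os.path.join(root, requested))
    return target == root or target.startswith(root + os.sep)

assert sandboxed_read_allowed("/home/amitu/site", "images/foo.png")
assert not sandboxed_read_allowed("/home/amitu/site", "../secrets.txt")
```

Network, write access, etc. would then be further capabilities that are off by default and granted per package, Deno-style.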

Cloud Consideration: CPU and RAM Limits

FPM is not only going to run locally. Tomorrow we will have fpm-repo, and we may even hope GitHub Pages etc. start supporting fpm-repo. So what one can do using processors has to be designed such that when fpm is running not locally but in the cloud, things stay manageable.

One option is we expect every fpm-repo to run everything in an isolated server, with CPU and RAM limits imposed at the Docker/pod etc. level.

The other option is we build resource checks into FPM itself. It seems something like this is possible with wasmer. Ethereum's WASM implementation has support for limits. Restricting RAM should be possible based on my current understanding of the WASM model (WASM runs against pre-allocated memory).

In general, having such limits is also good for FPM running locally. We want people to have as much confidence as possible when running untrusted FPM packages; we do not want crypto miners and other such nasties running on our end users' laptops.

WASM Limitations

WASM can do pure computation, but doing arbitrary things from WASM is not possible; we have to provide access to features or functions that WASM can call to do things. If some processor wants to call arbitrary C libraries etc., that is not going to be possible in the WASM model.

We are choosing WASM to make it safe to run arbitrary programs, so putting limitations on what it can do is required. We cannot have both arbitrary access and safety.

Processor WASM API
For now we will only allow one method: fpm_host.get_document(id: String) -> Option<String>, which will allow a package to read a file in the current package. We will have a permission, can-read-package-files, which will have to be enabled before this works.
FPM.ftd
-- fpm.dependency: amitu.com/include-processor as aip
allow: can-read-package-files
implements: fpm.dev/processor
We have explicitly allowed this permission. Every package that wants to act as an fpm processor has to implement the fpm.dev/processor interface. We can now use this processor in our FTD file:
-- import: aip

-- ft.code:
lang: ftd
$processor$: aip.include
$file$: hello.rs

Here we are trying to use amitu.com/include-processor, which is aliased to aip in FPM.ftd file. By using explicit import we get to use namespacing offered by FTD.

In future we will slowly allow access to TCP, HTTP, more file operations and so on.
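A sketch of how the host side of fpm_host.get_document might enforce the permission (the permission name comes from above; the in-memory document map is a stand-in for real package file access):

```python
# Sketch of the host side of fpm_host.get_document. The set of granted
# permissions would come from the `allow:` lines in FPM.ftd; `documents`
# is a stand-in for the package's files on disk.
def get_document(doc_id, permissions, documents):
    if "can-read-package-files" not in permissions:
        raise PermissionError("processor lacks can-read-package-files")
    # Returning None mirrors Option<String>'s None for a missing document.
    return documents.get(doc_id)

docs = {"hello.rs": "fn main() {}"}
assert get_document("hello.rs", {"can-read-package-files"}, docs) == "fn main() {}"
```

Future capabilities (TCP, HTTP, writes) would be further host functions, each gated on its own permission the same way.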

Host Methods and WASM

Host methods run on the frontend, in the browser, after page load, as part of event handlers. We can use WASM there as well.

We are going to have a “rich stdlib”, we do not want to create meaningless fragmentation. We believe in batteries included. We have the luxury of only including the code of a host method if that method is being called from the current page. So a big library is okay.

Host methods will be able to call the "fpm stdlib". It would be tricky to detect which library functions have been called, though; one cannot do source code analysis, since the source is already compiled into WASM. Maybe we can analyse the WASM to see all calls to the outside world. For the purpose of this discussion let's assume it is possible (if not, we will have to create some sort of opt-in: a method can only call stdlib functions it has declared in some form of manifest file. Sucks, but what can we do; we cannot include the entire stdlib in every page generated using FPM after all).

host-method package dependency
-- fpm.dependency: amitu.com/parse-date
implements: fpm.dev/host-method
Any package that provides a host method should implement the fpm.dev/host-method package interface.
Say we want to parse the date entered by the user in an input field, and we have a package amitu.com/parse-date, which provides a host method "parse" that can do this:
-- ftd.input:
$on-change$: message-host: amitu.com/parse-date.parse, error=error-variable-name
> target=where-to-store-date
Where Would WASM Files Be Stored?
We can store wasm files in FPM folder. Maybe we can have each wasm file export a single function, so we only include the minimum amount of code. For processors we can club all of them together in one file.
Should Processors Be Callable From Frontend?

Currently the backend processors only work with data, from FTD's perspective: FTD contains a bunch of variables, and processors create more data that gets added to the FTD variable space. Frontend methods also interact with FTD via variables.

The capabilities available to the frontend do not make sense for the backend. Some backend ones may make sense for the frontend, though not backend-only functionality like file system access, database queries, network, etc.

There can be some overlap, some functionalities should be available both in frontend and backend. But it can not be generalised.

Static Assets From Dependencies

9th Feb 2022

For static assets we have to solve three problems.

Where are the assets stored?

In the main package, we create a .build folder. The dependencies are checked out in .packages folder.

For a dependency d.com/a, the asset images/foo.png will be stored in .build/_assets/d.com/a/images/foo.png.

There is not much to this part.

When to copy?

Should we copy all assets from the .packages folder into .build/_assets? Or should we do it on demand?

The easy answer for now is let's copy all static files (non-ftd and non-md files in a package are considered static). In future we can optimise this.

How to ensure paths work?

This is the trickiest part.

So say d.com/a/lib.ftd has a component:

-- ftd.image logo:
src: images/foo.png
We are going to use a relative path here. Technically, if someone uses an absolute path, we can still tell whether the path points inside the package or outside it, e.g. if they used:
-- ftd.image logo:
src: https://d.com/a/images/foo.png

This is the worst case scenario. They should not use it: if they do, while developing they would not see local modifications to foo.png without publishing.

But technically we can rewrite these URLs too. We do not want to keep downloading from d.com, increasing their hosting bills, and also take the risk of them blocking assets via some tweak in their nginx conf etc., leading to our site giving 404s.

So we should rewrite the URL to /_assets/d.com/a/images/foo.png, and now it would be served by our hosting provider and things will be in our control.

Repeating, technically we can rewrite such URLs as well, and maybe we should. Then comes next hard part:

d.com/a/lib.ftd
-- ftd.image logo:
string src: /images/foo.png
src: $src
We have to change the src again to /_assets/d.com/a/images/foo.png. Should we rewrite all paths? What if this was part of the main package? Or what if we did this from our main package:
-- import: d.com/a/lib

-- lib.logo:
src: images/foo.png

Now is this referring to /images/foo.png from our package, or /_assets/d.com/a/images/foo.png?

We can argue we will only handle relative URLs and not absolute ones. Further, when we come across a relative URL we will resolve it to a full URL based on the current package. We already track the current package, so this should be easy.

So the same string in different packages will refer to different values. This becomes more complex in this scenario:

d.com/a/lib.ftd
-- string logo-1: images/foo.png
-- string logo-2: images/two.png

-- ftd.image logo:
string src: $logo-1
src: $src
main package:
-- import: d.com/a/lib

-- lib.logo:
src: $lib.logo-2

-- lib.logo:
src: images/two.png

Here both values are the same, but should they resolve to the same URL or different URLs?

What if we say neither URL is right, and the lib should include the path including the package name:

d.com/a/lib.ftd
-- string logo-1: d.com/a/images/foo.png
-- string logo-2: d.com/a/images/two.png

-- ftd.image logo:
string src: $logo-1
src: $src
main package:
-- import: d.com/a/lib

-- lib.logo:
src: $lib.logo-2

-- lib.logo:
src: amitu.com/images/two.png

With this we never have to worry whether images/foo.png refers to amitu.com or d.com/a. It's explicit.

It is still a "relative URL" because it does not start with / or a protocol.

So we look at all relative URLs containing package name, and either remove the package name: amitu.com/images/two.png -> images/two.png or add _assets: d.com/a/images/two.png -> _assets/d.com/a/images/two.png. We can even resolve https://d.com/a/images/two.png -> _assets/d.com/a/images/two.png.

If the relative URL doesn't contain the current package or a dependency name, then we leave it as is.

Theme designers would be encouraged to use their package name in the assets.
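The rewriting rules above can be sketched as follows (package and dependency names are taken from the examples in this section; this is an illustration of the rules, not FPM's actual implementation):

```python
# Sketch of the asset URL rewriting rules: strip the current package's name,
# map dependency-prefixed URLs into _assets/, leave everything else alone.
def resolve_asset(url, current, deps):
    path = url
    for prefix in ("https://", "http://"):
        if path.startswith(prefix):
            path = path[len(prefix):]
    if path.startswith(current + "/"):
        return path[len(current) + 1:]
    for d in deps:
        if path.startswith(d + "/"):
            return "_assets/" + path
    return url  # no known package name: leave untouched

assert resolve_asset("amitu.com/images/two.png", "amitu.com", ["d.com/a"]) == "images/two.png"
assert resolve_asset("d.com/a/images/two.png", "amitu.com", ["d.com/a"]) == "_assets/d.com/a/images/two.png"
```

Protocol-prefixed URLs to known packages get the same treatment, which covers the https://d.com/a/... case mentioned above.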

Base And Versioning

Instead of resolving https://d.com/a/images/two.png -> _assets/d.com/a/images/two.png etc., we can resolve to $HOST$BASE/_assets/d.com/a/images/two.png when the --base argument is used, where $HOST would be e.g. http://127.0.0.1:8000 or https://amitu.com. This is to ensure image URLs do not depend on $BASE, so that we can use $BASE for handling markdown URLs.

The markdown URL problem: tutorial/ should resolve into $BASE/tutorial/; this works if the base is common for the entire project, as it has been so far. But when we have versioning, we may want tutorial/ to resolve into either /v1/tutorial/ or /v2/tutorial/, and this will depend on the current version. So when we have version support we will want to keep the base free to handle versioning, but we are assuming assets will be shared across all versions.
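A sketch of the markdown-link side of this, assuming $BASE carries the version prefix (the function name is illustrative):

```python
# Sketch: markdown links resolve against $BASE, which may carry a version,
# so the same source link lands in the right version of the site, while
# asset URLs are resolved separately and shared across versions.
def resolve_markdown_link(link: str, base: str) -> str:
    return base.rstrip("/") + "/" + link.lstrip("/")

# The same document works whichever version it is published under:
assert resolve_markdown_link("tutorial/", "/v1/") == "/v1/tutorial/"
assert resolve_markdown_link("tutorial/", "/v2/") == "/v2/tutorial/"
```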

On main:

4th Feb 2022

So Shobhit has implemented the "main" proposal. What we are trying to avoid is imports at the top of each file; with main we manage that. But now we lose discoverability: where does a symbol come from? In the main proposal, the fact that author is coming from foo.com/book-theme is not obvious, as you are writing:

FPM.ftd
-- fpm.dependency: foo.com/book-theme
implements: fpm.dev/blog
fpm.dev/blog declares main: book, author, which tells you that, because of the implements, every module in the current package has access to author.
Revised proposal
The idea of main: was to ensure we do not have all the code in the index file; with auto-import we can get the same:
FPM.ftd
-- fpm.dependency: foo.com/book-theme
auto-import: foo.com/book-theme/author
auto-import: foo.com/book-theme/article as ft

With this author would be available to every module in current package.

Further, we will not have : (reason: when using / we get a full URL that we can paste in a browser to see the doc of an individual module), and we will always refer to the current package either by the package keyword, or we can even allow an as alias syntax on the -- fpm.package: line.

Static Assets From Dependencies
Let's say we let any package specify some exports; these could be images or even ftd files (e.g. contact-us, about-us can be auto-created without cluttering).
theme/FPM.ftd
-- import: fpm
-- fpm.package: theme
-- fpm.export: images/
default: true
-- fpm.export: sample/about-us.ftd as about-us.ftd
default: false
When I am using the theme I can be explicit:
-- fpm.dependency: theme
include: *

include: * would be the default. We can have include: defaults to only import the default ones. We can also include individual patterns explicitly.

Note: exports are not considered part of package interface.
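A sketch of how include might filter a dependency's exports (the defaults keyword and glob-style pattern matching are assumptions; the text above only fixes the * and default-only behaviours):

```python
from fnmatch import fnmatch

# Sketch: which exported files an `include:` line on a dependency pulls in.
# exports: list of (path, is_default); include: "*", "defaults", or a pattern.
def included(exports, include="*"):
    if include == "*":
        return [path for path, _ in exports]
    if include == "defaults":
        return [path for path, default in exports if default]
    return [path for path, _ in exports if fnmatch(path, include)]

exports = [("images/", True), ("sample/about-us.ftd", False)]
assert included(exports) == ["images/", "sample/about-us.ftd"]
assert included(exports, "defaults") == ["images/"]
```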

Can we import any file?

Not seeing any reason why not. There is a question of backward compatibility: I, as a theme author, may want to move things around; things that are not explicitly exported are private to me.

This feature is largely so that components exposed by a theme can reliably refer to assets made available by the theme. Ideally only theme components should know about the existence of such assets.

Auto Include In A Name-spaced Folder

In .build we can create .build/foo.com/blog-theme/images/bg.png etc., and we can do it on demand (bg.png is copied only if some component that refers to it is actually used while rendering a document). Since assets in this scheme can never conflict, they can be considered opaque, and the using package cannot refer to them directly.

A component in a theme refers to images in the theme via a relative URL, e.g. images/bg.png. The problem: say the theme has a few images and some component in that theme lets you pass one of those images as a component argument. How would we refer to that image, by full URL or partial URL? We can use the partial URL, and since the component knows it is part of some theme, it will resolve. But what if my author package has some other image and I do not want the theme-provided URL but my own? We are then forced to give the full URL, not a partial one. One option is we always use the full URL: foo.com/book-theme/images/bg.png. At least in this case, because the URL starts with the package name, we know it belongs to that package, and so we know it is either https://foo.com/book-theme/images/bg.png or https://author.com/FPM/foo.com/book-theme/images/bg.png. If the theme component referred to the URL with a protocol, it would be absolute and we would not analyse it.

Ideally we should copy on access.

Package Notifications

3rd Feb 2022

I was going through MagicBell and I quite love them. It gave me the idea that people today almost universally hate the email subscription dialog, and yet universally are driven to clicking that red icon.

Imagine any FPM page shows a notifications icon like this in the header. We do not have login but we have cookies; we keep track of what you have read via a cookie or local storage. In the future we can also include a signup-via-email option in the notification dropdown, or even subscribe-via-browser using the Push API.

Email and Push will require server side support, but basic notification without that is also quite useful and can be easily implemented, if we somehow solve the problem of how the notifications are stored in an FPM package.

What to show in the dropdown, and the link for each item, has to be discoverable by fpm somehow. Maybe we can have:

fpm.ftd notification variable
-- record news-item:
caption title:
string link:
optional string src:
body description:
datetime created-on:

-- news-item list news:
So people can create notifications in say config.ftd:
-- import: fpm

-- fpm.news: Package Notifications
link: /journal/#package-notification
created-on: 2022-02-03 11:30AM IST

A spec for showing notifications in header has been just added to Journal.

We will ask people to keep only some meaningful number of items in news, say 10, and keep deleting the old ones. Or we can keep the entire archive and only consider the first 10 when showing the notifications. We can also show all the news items in /FPM/news/ page.

We will create a package, fifthtry.github.io/notifications, which will act as the widget: it will read fpm.news and show it. We will have a message-host function to store the latest notification's created-on in a cookie, and on page load the value of the cookie would be made available in the fpm.latest-news-read variable, so any news items later than fpm.latest-news-read will be shown as unread.

We will mark the news as read when the notification icon is clicked.

To implement this we ideally need a datetime type; else, if we store created-on as a string, we still need a > check. We also need a way to get the first item of a list. To show the dot in the closed icon state, we need to compare the cookie value against the created-on of the latest news. The created-on of the latest news can be stored as another fpm variable for convenience until we get first-item-of-a-list support. Once the UI is open, we will want to show everything that is new, so we will definitely need a string comparison operator, >.
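A sketch of the unread check. Note that for plain string comparison (>) to work, created-on needs a sortable format such as ISO 8601, unlike the "2022-02-03 11:30AM IST" style in the example above; the data below is illustrative.

```python
# Sketch: an item is unread if its created-on is later than the value
# stored in the cookie. Plain string > works only for sortable timestamps.
def unread(news, latest_read):
    return [item for item in news if item["created-on"] > latest_read]

news = [
    {"title": "Package Notifications", "created-on": "2022-02-03T11:30"},
    {"title": "Older item", "created-on": "2022-01-20T09:00"},
]
assert [n["title"] for n in unread(news, "2022-01-31T00:00")] == ["Package Notifications"]
```

The dot on the closed icon is just `unread(news, cookie_value)` being non-empty for the latest item.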

FPM Package Registry

Currently FPM is fully distributed; each FPM package is published on a URL controlled by the owner of the package. To aid discoverability it would be cool if there was a package registry though: a central place where you can quickly discover packages.

Is this going against our decentralised design? Not really. FPM does not really depend on the registry, unlike other package managers. You can depend on any package, and if some package is removed from the package registry, all it will lose is a listing on some site. The package itself would still be available, and anyone who wants to use the package can still use it.

The registry is just a handy place to quickly find themes, components etc.; we can easily show all (or most) packages that implement a given package interface. We can even show stats like how many packages depend on any given package, which can be a good proxy for people who want to take a dependency on that package: a package that is quite popular, for example, may be safer to use than a package that just came up yesterday and no one is using.

We will make fpm.dev the "fpm central" or "fpm package repository", which means eventually fpm.dev will have dynamic stuff. Can we build the dynamic stuff using our proposed FPM Web Framework? Time will tell. We will not block on it. The FPM package repository could tomorrow become an open source, real-world showcase of the fpm web framework when it is in place, though.

fpm ping
The way the registry would work is by a package pinging the central repository that the package has been updated. The ping itself will not be trusted; we will not have API keys, signup etc. fpm ping will simply call an endpoint and pass the package name as the only information. The FPM registry will then schedule a crawl of the fpm package and get the information from the published package.
Registry and News
We just discussed adding news support in FPM packages. The FPM registry can show the news, and may give a quick "notify me when this package has news" button next to each package listed.
Registry And Web Hook

Say you are authoring an FPM package and it depends on a few FPM packages; it would be cool to auto-republish your fpm package every time any of the dependencies changes.

Since fpm ping notifies the registry when a package changes, the fpm registry can tomorrow also notify arbitrary webhook endpoints when news is published about any package. You can listen for that and review-and-republish, or auto-republish.

Decentralized Registry?
The first idea was a centralised registry, but we could have, say, tens of registries, each gossiping with the others about updates, so there is no central point. Whichever registry you publish to, you can expect that eventually the entire registry network will know about updates to your package.
More on package.zip

2nd Feb 2022

The idea is we will create package.zip in the .build folder whenever we do fpm build. Currently the zip file we download is generated by GitHub and contains the source of the entire repo.

In the package.zip we will include the entire repo.

- FPM.ftd
- src
  - index.ftd
- code.rs

In this case the repo has a single package, and the build will simply include all the files of the package minus the files ignored by fpm build (we auto-ignore some files like .git etc., and we allow FPM.ftd to specify other files and patterns to ignore).

Consider a repo with multiple FPM packages:

- en/
  - FPM.ftd
  - index.ftd
  - a.ftd
  - image.png
- hi/
  - FPM.ftd
- code.rs

In this case, when calling fpm build we have to cd into one of the two folders, and each of en and hi will get its own .build folder containing its own package.zip file.

The zip will still include all files in the repo based on the previous ignore logic. We will introduce a new ignore heuristic: if we come across a file that belongs to another FPM package, auto-ignore it as well. So when building package.zip for the en folder, code.rs will be included but not the hi folder.
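The heuristic can be sketched as follows (a simplified model: directories are identified by their FPM.ftd file, and the function name is illustrative):

```python
# Sketch of the auto-ignore-other-packages heuristic: when zipping one
# package, skip files living under any other directory that has its own
# FPM.ftd (i.e. that is the root of a different FPM package).
def files_for_zip(all_files, package_dir):
    package_dirs = {
        f.rsplit("/", 1)[0]
        for f in all_files
        if f.endswith("FPM.ftd") and "/" in f
    }
    other = package_dirs - {package_dir}
    return [
        f for f in all_files
        if not any(f.startswith(d + "/") for d in other)
    ]

repo = ["en/FPM.ftd", "en/index.ftd", "en/a.ftd", "hi/FPM.ftd", "code.rs"]
assert files_for_zip(repo, "en") == ["en/FPM.ftd", "en/index.ftd", "en/a.ftd", "code.rs"]
```

Opting out would simply skip this filter and fall back to the previous ignore logic.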

A package can opt out of the auto-ignore-other-packages-in-zip heuristic by setting this key to false in FPM.ftd. This heuristic exists only to reduce the package.zip size.

The package.zip will also contain an FPM.manifest.ftd file, which will currently contain only one setting: the location of the FPM.ftd file that should be the starting point. If we have disabled the auto-ignore-other-packages-in-zip heuristic, en/.build/package.zip will contain both the en and hi folders, and without the manifest file we would not know where the FPM.ftd file is; so we need the manifest file.

Confusion On Ignore

Currently we use the "ignored files" terminology to refer both to files we never want to know about, e.g. .git files, and to files we know about but currently do not want to copy to the .build folder, e.g. the .history or .track folders.

From now on, "ignored files" means files that are ignored by all fpm commands; they are as good as not being on disk.

Among the non-ignored files, there are files, like those in the .history folder, which some fpm commands need but not others.

Previously we assumed ignored meant files that fpm build ignores, but now we want ignored files to mean files that the entire fpm system ignores. Non-ignored files that are still skipped by fpm build have no global name; that is fpm build's internal detail, and fpm build handles it however is efficient.

code.rs is part of package?
Say we have this folder structure:
- en/
  - FPM.ftd
  - index.ftd
  - a.ftd
  - image.png
- hi/
  - FPM.ftd
- code.rs

When someone imports the en package, they can import en:a.ftd, but can they also refer to code.rs directly somehow?

Decision: no. code.rs is part of package.zip, but is not considered part of the en package. Only files directly inside en (or, in the case of hi, files in hi or in en, as part of our fallback logic) are considered part of the package.

So you can import hi:a, even though there is no such file, because we have fallback logic.

Bind Proposal
One option is we do not let people go out of the en folder, and if any file has to live outside, that file can only be accessed after it has been explicitly bound in FPM.ftd:
-- import: fpm
-- fpm.package: en
-- fpm.bind: ../code.rs as foo.rs
-- fpm.bind: ../images
This is like we are implementing our own file linking feature.
Solution 1
With bind, the generated package will simply copy the bound files. So we will not care about repo structure at all when generating package.zip. When building package.zip for en, we will start from the en folder; the zip will contain all content of the en folder, and then all bound files/dirs. So we will do cp ../code.rs foo.rs and cp -r ../images .
Solution: Final?
We have explicit bind in FPM.ftd. When generating the zip we keep the repo structure; we only copy files that are bound. We will also generate the manifest file.
Removing .zip Key

29th Jan 2022

When creating a new package we have to write, e.g.:

-- import: fpm

-- fpm.package: fifthtry.github.io/dark-mode-switcher
zip: github.com/fifthtry/dark-mode-switcher/archive/refs/heads/main.zip

Which is annoying. We need the publish URL, and that is good; we are going to share it everywhere. But we need the zip too, and it's only used by fpm for internal purposes (downloading the package). Writing the zip key, and reading it, are both noise for humans.

My first thought was we auto-generate the zip URL by parsing the package name, which would be fragile and won't work if someone is using a custom domain etc. Then it occurred to me that we can create the zip file and upload it as part of the build itself, so the zip file URL will always be <package-name>/package.zip.

It has one positive aspect: the zip URL as defined currently can diverge from the published package. People may see one thing on the package URL but download something else: the latest code on main. The build may not have happened for subsequent commits, so things can diverge. The zip as part of the package ensures we always get the source of what we see.

This can arguably be a downside as well: if there is something wrong with the package, one cannot just fix it by committing on main; one now has to figure out how to update the zip. I am not sure how often this will come up, though.

Overall I think we should go ahead with this proposal.

Scroll To View Host Function

Was wondering how we would implement this site. It has a single page, and all header links simply scroll to different portions of the same page. The scroll behaviour is also smooth, not abrupt.

We can implement the abrupt scroll behaviour currently: we give each heading a unique id, or authors can specify their own id, so link targets can be #foo. But this will jump abruptly.

So we will give the following fpm host function:

-- object scroll-to-foo:
function: scroll-to-id
id: foo

-- ftd.text: About Us
$on-click$: message-host $scroll-to-foo

This will use the “smooth” behaviour.

We should eventually be able to do:

-- import: fpm

-- ftd.text: About Us
$on-click$: message-host $fpm.scroll-to:
> id: foo
Or even better:
-- import: fpm

-- ftd.text: About Us
$on-click$: $fpm.scroll-to: id=foo
Markdown Styling

Shobhit and I were wondering how we would fix markdown styling. Currently we have a CSS file containing styling that gets applied to markdown-generated HTML, and currently there is no way to update that CSS.

We want any theme to be able to update the markdown styling. And yet we do not want to let anyone modify CSS, we consider CSS a liability, an escape hatch, which will hamper our plans to write non HTML based rendering for FTD.

When we parse markdown we get elements like heading, paragraph, link, image, tables, and lists. We eventually realised if we had a syntax like this:

-- ftd.markdown:
h1: ft.h1
h2: ft.h2
link: ft.link
paragraph: ft.paragraph

the markdown text

We can solve our problems. ftd.markdown would be a new ftd kernel element which will take a bunch of ftd.ui constructors, which will then be invoked for every heading, paragraph etc that we encounter in the markdown.

This way any theme can create their own markdown wrapper component and pass it all the component constructors, and completely control the rendering.
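As a minimal sketch of the dispatch described above (all names here are invented for illustration; the real ftd.markdown element does not exist yet), each parsed markdown element is routed to a theme-supplied component constructor:

```python
# Hypothetical sketch: route parsed markdown elements to theme-supplied
# component constructors, as the proposed ftd.markdown kernel element would.

def render_markdown(elements, constructors):
    """elements: list of (kind, text) tuples from a markdown parser.
    constructors: maps a kind ("h1", "paragraph", "link", ...) to a
    function building the theme's UI node for that element."""
    nodes = []
    for kind, text in elements:
        build = constructors.get(kind)
        if build is None:
            raise ValueError(f"no component registered for {kind!r}")
        nodes.append(build(text))
    return nodes

# A theme passes its own constructors, completely controlling rendering:
theme = {
    "h1": lambda t: {"component": "ft.h1", "text": t},
    "paragraph": lambda t: {"component": "ft.paragraph", "text": t},
}
doc = [("h1", "Title"), ("paragraph", "the markdown text")]
nodes = render_markdown(doc, theme)
```

Swapping the constructors dictionary is all a theme needs to do to restyle every heading and paragraph.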

Syntax And Theme Packages

26th Jan 2022

Recently discussed “Thought: Font/Asset Packages”, and we had a need for packages that can be used to distribute fonts, and other assets like images. When designing a theme we realised we have to give users the ability to modify the syntax definitions; we currently ship with whatever comes with syntect plus a few extra syntax definitions in the ftd repo.

We have to allow people to define both extra language support and theme support. Also we have to ship languages and themes as FPM packages, so they can be added as dependencies. We would need some metadata in the FPM package to indicate that they are special theme or syntax definition packages. We can do it by package interface: if a package is declared to implement a package interface, e.g. fpm/syntax or fpm/syntax-theme, we would look for special files in such packages that are dependencies of the current package and make them available to ftd.
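The lookup could work roughly like this (a sketch with invented data shapes; the interface name and file extension are assumptions, though syntect does load Sublime Text `.sublime-syntax` definitions):

```python
# Hypothetical sketch: collect syntax-definition files from dependencies
# that declare they implement the fpm/syntax package interface.

def collect_syntax_files(dependencies):
    """dependencies: list of dicts with "name", "implements" (list of
    package interfaces) and "files" (paths inside the package)."""
    found = []
    for dep in dependencies:
        if "fpm/syntax" in dep.get("implements", []):
            found.extend(f for f in dep["files"] if f.endswith(".sublime-syntax"))
    return found

deps = [
    {"name": "acme/rust-syntax", "implements": ["fpm/syntax"],
     "files": ["rust.sublime-syntax", "README.md"]},
    {"name": "acme/dark-theme", "implements": ["fpm/syntax-theme"],
     "files": ["dark.tmTheme"]},
]
files = collect_syntax_files(deps)
```

Only packages that explicitly declare the interface get scanned, so an ordinary dependency never accidentally contributes syntax files.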

-- import: package?

Currently when you see -- import: foo, foo can either be a package name or alias, or foo can be a module (ftd file) in the current package. What if we stop doing the latter, and the only way to import foo.ftd in the current package was: -- import: package:foo?

We can go even one step further, and allow the package name itself to be aliased (if needed), and always use <current-package-name:module-name> when importing.

So both? I tend to prefer one. Rust does crate. But Rust also lets you rename crate to any name, and I tend to use that in all my Rust code instead of crate.

Release Automation

21st Jan 2022

Shobhit has been implementing some release automation of late. Now our Github Action creates binaries for the three supported platforms (we only run our unit tests on Linux, so technically Windows is not completely supported right now, but we still want to support it; if you come across any issue please let us know and we will resolve it asap). The workflow also creates a release.

We have a script, install-fpm.sh, which we recommend you use in your CI systems for installing the latest release of fpm. It supports both Linux and Mac. It also lets you choose whether you want the latest pre-release or the latest release of fpm.

You can use install-fpm.sh --pre-release to get the pre-release. Without the argument it installs the latest release.

FPM.dev, ftd.dev and a bunch of FifthTry sites use fpm in --pre-release mode so as to catch bugs and regressions early.

TOC Syntax
We have been discussing the toc syntax of late. fpm comes with an ftd $processor$, toc. Currently it has syntax like this:
-- toc:
$processor$: toc

- https://google.com
  Google
- /some/url/
  Some URL
  - /some/nested/page/
    Supports Nesting as Well

It has a few issues:

  1. We have both title and URL as compulsory parameters; we want to allow people to create TOC entries that are not yet linked with any URL, and show them in the TOC without a link, for chapters or topics that are coming soon.
  2. We want to be able to specify src: to be used as an icon for an entry.
  3. We want to add “separators” in the TOC.
  4. We want to be able to use “part headings”; these are not links, but are used to group a bunch of chapters together.
  5. We want to let people use font-icon style fonts.
  6. More compact link specifications when it makes sense.

- <title>
  url: <url>
  disabled: false
  src: asd
  font-icon: fad fa-bell

  - <title>: <url>
  - --
  - src: foo.png
  - <title>

Title now comes on the first line. If there is space you can put the title and URL on the same line. url, src, font-icon, disabled etc are other metadata you can pass to an entry in the toc.

Nesting is still done with two spaces. You can use blank lines before and after entries, but not between the properties.
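A flat version of the proposed syntax could be parsed along these lines (a sketch only; nesting by indentation is left out to keep it short, and the exact separator and compact-form rules are assumptions based on the examples above):

```python
# Hypothetical sketch: parse the proposed TOC entries. A title line is
# "- <title>" or the compact "- <title>: <url>"; "- --" is a separator;
# indented "key: value" lines attach metadata to the previous entry.

def parse_toc(lines):
    entries = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("- "):
            body = line[2:]
            if body == "--":
                entries.append({"separator": True})
            elif ": " in body:  # compact "- <title>: <url>" form
                title, url = body.split(": ", 1)
                entries.append({"title": title, "url": url})
            else:
                entries.append({"title": body})
        elif ":" in line and entries:  # metadata for the last entry
            key, value = line.split(":", 1)
            entries[-1][key.strip()] = value.strip()
    return entries

toc = parse_toc([
    "- Getting Started",
    "  url: /start/",
    "  font-icon: fad fa-bell",
    "- --",
    "- Google: https://google.com",
])
```

Note how the first entry gets its url and font-icon from the indented lines, while the last uses the compact one-line form.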

Package Interfaces: Strict Vs Compatible
17th Jan 2022
-- resume.me: Amit {bold: Upadhyay}
total-experience: 18 years

--- bio:
hello, this me.
--- ft.image:
--- ftd.row:
--- ft.markdown:
--- ft.image:

--- container: resume.me

-- ft.markdown:
resume-1.ftd
-- ftd.row me:
caption name:
string total-experience
resume-2.ftd
-- record person
caption name:
string total-experience
ftd.ui bio:

-- person me:


-- ftd.column:
spacing: 20

--- me.bio:

-- resume.degree: BTech, Mechanical Engineering
university: IIT-Bombay
graduation: 2003

-- resume.job: CEO
company: FifthTry, Inc
industry: Developer Tools
start-date: 2020-August

--- resume.role: CEO
start-date: 2020-August
Another Thought On Prelude

14th Jan 2022

So yesterday we settled on this as the solution for our prelude problem:

-- fpm.dependency: fifthtry.github.io/Jupiter-theme as j
book: j.book
ft: j.ft

After a good night's sleep I feel we can evolve this further. The thing is we did not discuss “default” exports for every “package interface”, nor are we using the planned implements syntax yet. The idea is that when you import a dependency, you may want to use it only because it implements the package interface of a package you are interested in. Say you want something that implements fpm.dev/blog, and you found Jay/Fancy-Blog-Theme. Now Jay/Fancy-Blog-Theme may implement more than just fpm.dev/blog; it may implement other packages. Also Jay/Fancy-Blog-Theme may define some components that are not part of fpm.dev/blog, which means if you used them, switching to another theme would become hard, as that other theme may not implement Jay/Fancy-Blog-Theme.

So we do not want to access all modules and symbols of a package; we want to constrain what modules and symbols are “visible” to us. We can do that using the proposed implements syntax:

-- fpm.dependency: fifthtry.github.io/Jupiter-theme as j
implements: fpm.dev/blog

Using implements does three things: 1. ensures fifthtry.github.io/Jupiter-theme has modules and compatible symbols as defined in fpm.dev/blog, 2. any attempt to access extra modules or symbols that are not part of fpm.dev/blog leads to an error, and 3. makes fpm.dev/blog an alias for fifthtry.github.io/Jupiter-theme, so if any document has imported fpm.dev/blog, they will be getting fifthtry.github.io/Jupiter-theme instead.

We can use implements more than once on a dependency.

Now if you look at the two code snippets, it's obvious that book: j.book is conflicting with implements. We are defining the same thing twice. We definitely want to use the implements clause, but also the book: j.book.

Can we somehow do it in one go? This is not just a question of reducing the amount of things you have to write, it's also about creating consistency. Consider if a package could define default imports / “main” modules:

-- import: fpm

-- fpm.package: fpm.dev/blog
main: blog, post, author

With that definition of main, this:

-- fpm.dependency: fifthtry.github.io/Jupiter-theme as j
implements: fpm.dev/blog

is equivalent to:
-- fpm.dependency: fifthtry.github.io/Jupiter-theme as j
implements: fpm.dev/blog
blog: j.blog
post: j.post
author: j.author

Meaning if any ftd file from now onwards refers to blog, it will be getting j.blog, i.e. j.blog is auto imported in every ftd file. And we do not even have to write the explicit lines: we follow implements, look at main for the package being implemented, and auto import the main modules.

So while it obviously is a lot of “magic”, we are getting consistency: everyone refers to fpm.dev/blog.author by the alias author, even when using different themes. So switching the theme becomes easy, all the main imports would soon become well known, etc.

What will happen when say both fpm.dev/blog and fpm.dev/book have defined say a module named author?

In this case we can go on and define things explicitly:

-- fpm.dependency: fifthtry.github.io/Jupiter-theme as j
implements: fpm.dev/blog

-- fpm.dependency: fifthtry.github.io/Some-Book-theme as j
implements: fpm.dev/book
book-author: j.author

Let's assume fpm.dev/blog has main: blog, post, author and fpm.dev/book has main: book, author; author conflicts between the two. So just putting in the two dependencies without the book-author: j.author line, e.g.:
-- fpm.dependency: fifthtry.github.io/Jupiter-theme as j
implements: fpm.dev/blog

-- fpm.dependency: fifthtry.github.io/Some-Book-theme as j
implements: fpm.dev/book
would be an error: fpm will say both expose author and it's ambiguous. So we will have to explicitly map author to some other name for one of the packages.
-- fpm.dependency: fifthtry.github.io/Jupiter-theme as j
implements: fpm.dev/blog

-- fpm.dependency: fifthtry.github.io/Some-Book-theme as k
implements: fpm.dev/book
book-author: k.author

Here we are saying author will refer to j.author, the default, and k has given up author and used the book-author name.

We can even support this syntax:

-- fpm.dependency: fifthtry.github.io/Jupiter-theme as j
implements: fpm.dev/blog

-- fpm.dependency: fifthtry.github.io/Some-Book-theme as k
implements: fpm.dev/book
author: book-author

Here we are explicitly renaming author to book-author, and we do not have to care about the alias.
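The resolution rule described above can be sketched as follows (a sketch with invented data shapes, not fpm's actual implementation): walk the dependencies, expand each implemented interface's main list into aliases, apply any explicit renames, and fail on an ambiguous alias:

```python
# Hypothetical sketch: resolve auto-imported aliases from each
# dependency's implemented interface "main" list, erroring on an
# ambiguous alias unless the author renamed it explicitly.

def resolve_aliases(dependencies, interfaces):
    """dependencies: list of (package, implements, renames) tuples.
    interfaces: maps an interface name to its "main" module list."""
    aliases = {}
    for package, implements, renames in dependencies:
        for module in interfaces[implements]:
            alias = renames.get(module, module)
            if alias in aliases:
                raise ValueError(
                    f"both {aliases[alias]} and {package} expose {alias!r}; "
                    "rename one explicitly")
            aliases[alias] = package
    return aliases

interfaces = {
    "fpm.dev/blog": ["blog", "post", "author"],
    "fpm.dev/book": ["book", "author"],
}
deps = [
    ("fifthtry.github.io/Jupiter-theme", "fpm.dev/blog", {}),
    ("fifthtry.github.io/Some-Book-theme", "fpm.dev/book",
     {"author": "book-author"}),
]
aliases = resolve_aliases(deps, interfaces)
```

With the rename in place, author resolves to the blog theme and book-author to the book theme; dropping the rename reproduces the ambiguity error discussed above.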

Closing note: this is a lot of “magic” indeed, but consider that in the scientific computing and machine learning worlds, well defined aliases for common Python modules, e.g. np for numpy, pd for pandas, plt for matplotlib, are almost universally recognizable. We are creating a few such aliases, book, post, etc, which we hope will similarly become universally recognizable, so this magic is acceptable. It creates consistent naming, so any ftd document you come across in the wild can easily be copy-pasted into your project. You will always know what book is if you come across it anywhere, and can quickly pick from a huge number of themes that implement book.

More Thoughts On Search, And Host Architecture

We have to first implement simple things like “object constructor” in FTD. This is the core to how we pass data from FTD to “host”. We similarly need a “special variable” $VALUE, which will be available only in $on-change$ handlers, so we can do $on-change$: $query = $VALUE, assuming we already have a variable defined: -- optional string query:. These are easy bits.

Then we have the harder bit. In FTD the DOM structure is rendered by Rust code, and from the front-end we can only modify some variables: variables that show or hide an existing DOM node, or change some attribute of an existing DOM node. This is why we have to make every field conditional individually, and update our JS to change the DOM node.

What we need now is the ability to generate DOM nodes in the front-end. Search is going to fetch results from some service, and we have to create a child corresponding to each search result. We can do a hacky thing, pre-create say 10 DOM nodes, because we know the page is going to have only 10 search results, but this is not a great answer: consider if we had a tree, up to 4 levels deep; we would have to pre-create 10 * 10 * 10 * 10 = 10,000 DOM nodes.

So Arpita and I decided to create proper support for creating DOM from component definitions in the front-end. We will pass a specification of the components that are needed dynamically (we do a dynamic variable dependency analysis already). And then we will write some JavaScript code in ftd.js to create DOM based on the component spec and the data (it will happen every time the data changes).
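The essence of that plan, re-rendering a component spec against the data whenever the data changes, can be sketched like this (a sketch in Python with invented shapes; the real implementation would be JavaScript in ftd.js):

```python
# Hypothetical sketch: rebuild the node list from a component spec every
# time the data changes, so the node count always follows the data,
# instead of pre-creating a fixed number of DOM nodes.

def render(spec, data):
    """spec: a component constructor description; data: list of items."""
    return [{"component": spec["component"], "text": item[spec["text-field"]]}
            for item in data]

class DynamicList:
    def __init__(self, spec):
        self.spec = spec
        self.nodes = []

    def set_data(self, data):
        # Called by the host whenever new results arrive; re-renders fully.
        self.nodes = render(self.spec, data)

results = DynamicList({"component": "search-result", "text-field": "title"})
results.set_data([{"title": "First hit"}, {"title": "Second hit"}])
```

Because nodes are derived from data on every change, a 4-level tree needs no pre-created placeholders at all.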

We already have a bunch of JS logic, and this change will introduce quite a bit more, still quite small by JS library standards: our current ftd.js is only 3.3KB gzipped, and this will barely increase it to 10KB at worst. This will increase the amount of JSON we include in the page, as the specification of all the components used dynamically will also be included. Our compressed index.html for example is only 12KB (which includes JS/CSS etc, everything), so we still have quite a small build size.

We have done some interesting parsing for what we call “ftd markup”, and we will have to port that code over as well, which means we have to have a full markup parser in JavaScript, which will further increase the size.

We can do this in a conditional way: only include the markup/markdown parser if one of the dynamically called components needs it, or only include the dynamic component rendering part of the JS if there is at least one dynamic component. But even in the worst case I do not see our compressed, everything-included HTML going above 20-30KB, which is quite acceptable.

One very interesting consequence of this feature would be that FTD would become a general purpose UI library that can be used with any JS project. We would be almost like Elm: where Elm has “ports”, we will have message-host to send data from FTD to JS, and the ftd.set_var family of functions to set FTD data from JS.

This means you can tomorrow write full web applications like Twitter and GMail with FTD. Your JS will have no idea about DOM, your JS will simply set FTD variables, and FTD will take care of rendering them, and will call your JS from time to time, based on user events etc via message-host.

Thought: Theme-able Internet
When we are consuming a site, say HackerNews, we get their UI. But it would be cool if anyone could “theme” the UI. How could we do that? One option: we, or someone, maybe even the owner of the site, creates a “data type” and “package interface” for the site, and a “converter” which crawls the site and creates the site's data as ftd files in the package interface format. Then people can create their own themes implementing the package interface, and we have a theme-able internet.
Lint Vs Warning Vs Error

Was reading Rethinking errors, warnings, and lints, and I quite liked it. We should have a strict mode in ftd and a regular mode. In strict mode we fail on anything that may be wrong, e.g. unused imports, variables or components with improper case, unused component arguments and so on. We will try to be as strict as possible. Maybe even bad formatting should be considered an error in strict mode; all arguments in a component definition must come before the properties, for example.

We will then have fpm build not use the strict mode, so authors can quickly try things out. And finally we should have fpm build --release, which will use ftd in strict mode, forcing people to ensure all sloppiness is fixed before something is published.

We can also have fpm check which will test for all warnings, and can be used as part of pre-commit hooks.

Added And Removed git2

We used to depend on git when validating the fpm.package.zip URL. This failed on Ganesh's Windows machine as he does not have the git cli installed. We learned that Github Desktop does not install the git cli on Windows.

So Shobhit converted our code to use git2 (libgit2) as a Rust crate dependency, so we do not depend on an external binary. I further ensured that git2 is compiled with bundled SSL and libgit2 libraries so our CLI becomes more self contained.

While I was doing that I realised that our current reason for using git is not strong enough, so Arpita removed git2.

FPM Editor Plans

In future we may bring the git2 dependency back though, when fpm becomes a full blown “editor” as well. We want fpm to one day be the only binary authors need: it will be the editor, so you do not need, say, SublimeText; it will have git built in so you can clone and commit from the fpm editor itself; we will make fpm run as a web server; and maybe add a side by side preview with auto reload.

Integrated Authoring Environment it would be, IAE.

fpm-ui -> fpm

We used to have a special module named fpm-ui with a bunch of variables for translation etc. It also contained the fpm-ui.mobile and fpm-ui.dark-mode flags. We also had another special module, fpm, with another set of similar variables.

It was an implementation detail: there was a feature missing in FTD (the ability to refer to record fields defined in another module), and Arpita did not like “polluting” fpm with such variables, which would tomorrow become record fields, so she created fpm-ui.

Since then Arpita has implemented the fix: you can now do foo.bar.baz, where foo may be another module, foo.bar an instance of some record, and .baz a field on that record. So she cleaned things up by removing fpm-ui and moving all the variables to fpm.

We have not yet properly documented those variables which we will be doing soon.

The Struggle With Prelude

13th Jan 2022

We had a feature called prelude. You can define a file FPM/prelude.ftd, and this file is prepended to the top of every .ftd file in the package when it's being evaluated by FTD.

Motivation
The motivation for this feature was to not have to repeat some lines in every file, and also so it becomes easy to change one place and have it affect all files. There were two main use cases for this:
-- import: some-dependency.some-module-in-the-theme as ft

Here we wanted to be able to import different modules from inside any dependency to all files.

We can do dependency aliasing, so some-dependency is defined in FPM.ftd as -- fpm.dependency: some-org.github.io/Some-Dependency as some-dependency.

The other main use case was config. In every package we may want to set some global variables that should be available to all FTD files in that package. We can put them in say config.ftd, and then in FPM/prelude.ftd we can import the config, and it becomes available everywhere.

This Approach Did Not Work Out

The problem with this approach was that it fails if you import any local FTD file from FPM/prelude.ftd. Consider a document foo.ftd, and say foo.ftd imports bar.ftd. If we put -- import: foo in the prelude, this line would be included when bar.ftd is evaluated, which means bar.ftd will try to import foo.ftd, while foo.ftd is importing bar.ftd, which leads to recursion.

Now we can technically try to solve the recursion issue; it's a general deficiency of FTD that we can not handle recursion. We should be like Python, which allows such recursive imports: when two modules import each other, one of them, in the worst case, only sees the symbols the other has defined up to that point.

We can and will try to make our system more robust, but I did not like making that fix a dependency for prelude. It is possible that recursive dependency would cause surprises, and we may not be able to solve it cleanly. So I want to hold off on the question of recursive module imports and not link it with prelude.

We decided to remove this prelude feature recently.

The Second Try
Instead of FPM/prelude.ftd, we decided: what if we have a prelude section in FPM.ftd itself:
-- import: fpm
-- fpm.package: some-package
-- fpm.dependency: some-org.github.io/Some-Dependency as some-dependency
-- fpm.prelude:
ft: some-dependency.some-module-in-the-theme
foo: foo

So we would import some-dependency.some-module-in-the-theme as ft in every document. This tells us explicitly which documents are being imported, e.g. foo, and we would not include the prelude in files that are listed in the prelude section.

This sounded like a good idea, but it too did not work out. Say foo.ftd imported bar.ftd. When evaluating bar.ftd we would include the prelude stuff, so bar will try to import foo, and we are back to recursion. We could have done some document analysis, seen that bar is importing foo and foo is “special”, and treated bar too as special and not auto imported the prelude stuff, but we felt it's too magical, and generally did not like it.

We also considered and then rejected putting the entire “prelude” chain in the FPM folder itself, so we know which documents are special, and the prelude can only import from FPM and not elsewhere, but that too was too much magic. And there was also a risk that it may pull every document into the FPM folder due to transitive dependencies.

The Next Attempt

We finally decided we will give up on this line altogether and look at it more deeply. Why can't we have a normal module, prelude.ftd or whatever you want to call it, and most of the files in the package can choose to import prelude and get everything they want? This way we do not have anything special happening at all, everything is explicit.

To implement this we realised we need a feature: we want to be able to import a bunch of “symbols” in prelude, and export them from prelude. E.g. if prelude is:

-- import: foo
-- string message: hello
We should be able to do:
-- import: prelude
exposing: *

-- foo.hello: $message

We felt quite good about the design and Arpita wrote up a quick PR.

Unfortunately I am not at all happy with it, and here is the thought process.

Import Star Is An Anti Pattern
As soon as I saw the code snippet in the PR:
-- import: record
exposing: *

-- import: lib
exposing: amitu, default-phone

-- ftd.boolean: $bool

-- ftd.text: $amitu.name

-- ftd.text: $default-phone

-- address:

I started regretting it. Here things are still not too bad; imagine if both exposing lines had used *, we wouldn't know if amitu and default-phone came from record or lib.

In virtually every language import * is considered an anti pattern for this reason, or it should be: it asks readers to look at multiple files to figure out where some symbol is coming from. Sure, a good editor can help, you can Cmd-click on a symbol and go to its definition. But that requires tooling for reading, and we are designing FTD with the hope that low end editors can also be used without issues. This is why ftd does not have indentation.

But isn't import * a good pattern for prelude? In Rust for example, many libraries recommend:

use diesel::prelude::*;

If we go with this thought process, then too the PR is wrong, because exposing only exposes “symbols” defined by the imported module, meaning variables and components etc. It does not bring in imports from that module.

So if this was the prelude file:

-- import: foo
-- string message: hello

And if we did -- import: prelude exposing * (btw, we should also support this “compact syntax” along with the exposing header to make things a one-liner), it will bring in message, but not foo.

And prelude must not define new symbols; it should be just a file to import a bunch of things and create a useful common place.

So as it stands the current PR is both an anti pattern and useless for the task at hand.

Yet Another Proposal

Is this the fifth proposal? Not sure. I swear, Arpita, this was not on purpose, to create an elaborate joke.

What if we did this:

-- fpm.dependency: fifthtry.github.io/Jupiter-theme as j
book: j.book
ft: j.ft
A Detour: Index Only Or Package Interface Specific Documents?

In FPM we have the concept of a “package-interface”. The idea is that any package can act as a package interface, any number of other packages can implement that package interface, and all packages implementing a given package interface can be used interchangeably.

Currently the few packages we have so far all put all their code in the index.ftd file. Actually, this was so till a while back; when Arpita was creating a package interface for slides, she created a bunch of documents, slide-1, slide-2 and so on.

We have been thinking of -- import: foo exposing * from the point of view of making package index the “main” place which contains all the symbols exported by a package, but we can still define them in separate files, so index.ftd does not become too big.

But the question is why? Why did we feel a need to bring everything together into a single index.ftd instead of importing different modules? Because of our package aliasing. When defining a dependency we have the as <alias> syntax to make it easy to alias the package and then use the package name as a single word from any module.

-- fpm.dependency: fifthtry.github.io/K-theme as k
With the above we can do:
-- import: k

in any document, and then use things defined in k via the k.foo name. But this only works if k contains all the symbols we care about. If they are spread across different modules we have to do more imports.

It was a small thing, but now that the prelude is under discussion, we do not have to worry about it much.

And here is another more important point, which forces us to not have any symbol in the index.ftd file, and leave it only for documentation etc.

See, a package may implement more than one package interface. Say we have a package interface FifthTry/article, which defines a bunch of components for writing “articles”, which can be say blog post content, or book content, or content of a slide show etc. Article defines headings, code block, image, paragraph etc, which all prose will have.

Similarly we have a package FifthTry/slide which defines things like slide and slide-intro etc.

Now when someone is using a theme for slide, say a unicorn theme for slide, that theme will implement slide from FifthTry/slide but also h1 from FifthTry/article. Now if we put the two together in a single index.ftd, authors will have to write:

-- import: unicorn-slide as s

-- s.slide:

--- s.h1: slide heading

The problem here is that the content is now making the assumption that both FifthTry/slide and FifthTry/article are implemented by a single theme. If we wanted to use one implementation of FifthTry/slide but another implementation of FifthTry/article, we would have to change our content files to use two modules.

So we have decided that every package interface will define one or more modules that will be as unique as possible to that package interface (other package interfaces should avoid using that name). So we should have written:

-- import: unicorn-slide.slide as s
-- import: unicorn-slide.article as a

-- s.slide:

--- a.h1: slide heading
Now technically we could have also written:
-- import: unicorn-slide as s
-- import: unicorn-slide as a

-- s.slide:

--- a.h1: slide heading
If we did this then everything defined in index.ftd in unicorn-slide would be okay. But how do we “force” authors to do this? If they do not, they will have a harder time switching themes. So if FifthTry/slide had defined slide.ftd with the slide component, all authors would be forced to do the former.
Coming Back To The Proposal
So let's review this:
-- fpm.dependency: fifthtry.github.io/Jupiter-theme as j
book: j.book
a: j.article

With this we will now auto import all dependencies, e.g. j, in all documents. Further we will also import j.book as book and j.article as a and so on.

This way if we wanted to use FifthTry/article from some other theme, we would do:

-- fpm.dependency: fifthtry.github.io/Jupiter-theme as j
book: j.book

-- fpm.dependency: fifthtry.github.io/Some-Other-theme as k
a: k.article
And we will make the following an error:
-- fpm.dependency: fifthtry.github.io/Jupiter-theme as j
book: j.book
a: j.article

-- fpm.dependency: fifthtry.github.io/Some-Other-theme as k
a: k.article
But What Happened To config.ftd?

We still will have to import config.ftd from all files. What can we do about it?

We can do the same thing at current package as well:

-- fpm.package: fpm.dev
zip: github.com/fifthtry/fpm.dev/archive/refs/heads/main.zip
config: config

For now let's say we only allow one key, config, and it will be auto imported in all documents. Do not import other documents from config and you will be good, else we could be back to the recursive import issue.

We can also make this an automatic feature: if a file config.ftd is present, even if the key config is not found, we can auto import it.
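The decision rule is simple enough to sketch (invented function and data shapes, purely illustrative): an explicit config: key in FPM.ftd wins, otherwise the presence of config.ftd triggers the automatic fallback:

```python
# Hypothetical sketch: decide which module, if any, to auto import as
# config in every document of the package.

def config_module(package_keys, package_files):
    """package_keys: keys from the fpm.package section in FPM.ftd.
    package_files: filenames present in the package."""
    if "config" in package_keys:
        return package_keys["config"]   # explicit config: key wins
    if "config.ftd" in package_files:
        return "config"                 # automatic fallback
    return None                         # nothing to auto import
```

This keeps the explicit spelling authoritative while still making the common case zero-configuration.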

Will it work? I guess we will only find out after Arpita has implemented it and it comes time for me to merge it.

Thought: Package Identity

2nd Jan 2022

Still can't believe 22 is here! So I have been thinking about fpm update recently and came across nostr, which implements something very close to what I had in mind.

I kind of consider nostr's design the right design, a good balance between centralised (prone to attacks, censorship etc) and “pure” peer to peer (complex, less reliable etc). Maybe there are issues, but to me it sounds good.

So we know how to distribute packages. What is missing is package identity. Package names depend on DNS if you buy your own domain, or on other services, e.g. Github if you host on <username>.github.io etc. So how do we ensure you really own the package identity? That you can move from .github.io to something else without losing people who may be interested in your content?

The traditional way requires you to do a redirect from old to new. But this relies on the old service still running; companies come and go, domains can expire or be sold to someone with no interest in honouring the old contracts.

If not domain name then what? One candidate is cryptographic keys. But they are too hard, at least when you are targeting billions of publishers, like Twitter and Facebook do. Eventually your FPM packages should be your Twitter/Facebook profile, but how do you manage identity?

WhatsApp etc, E2E (End To End Encryption) systems, rely on some shared identity, e.g. a phone number or email address, as a convenience, and then they generate some keys behind the scenes; you can see the fingerprint of your keys in WhatsApp for example, and verify each other's identity (the phone number appears to be the identity from the layman's perspective, but it's backed by strong cryptographic foundations).

So let me repeat: when you are sending a message on WhatsApp, you think you are sending it to a phone number, but WhatsApp has internally exchanged some encryption keys with that contact, and it's going to that key.

Let me say it in a more formal way. In PKI, anyone can generate a key pair, a private key and a public key. This key pair can be considered an identity. When you send a message it goes to someone; that someone needs to be identified, and how we identify them can be considered their identity. What you can do is encrypt the message with the public key of the recipient, and only the corresponding private key can decrypt the message.

So in this sense a key pair can be your identity. But key pairs are just numbers, and unlike phone numbers they are much bigger. So exchanging these numbers is not fun; they are opaque, no one can remember them, nor can anyone tell what a huge number representing a public key refers to if they see it in someone's FPM.ftd file as a dependency.

So we need a way for the key pair to get an alias. The domain name is a good alias. If we treat DNS as the alias, not the source of truth, things are much better. Identity moves are rare, but they do happen: people switch providers, people may want to move from Github Pages to another service, and back. Once someone starts following someone, such moves should not affect them; the followers should continue to get updates despite name changes. So how do we achieve this with PKI and aliases?

Imagine Github Pages implemented our protocol; this is how it could work. They generate a key pair for you when you sign up. This is important if we want billions of people to use such services; most people are not going to want to bother with all this. This is what WhatsApp/Telegram do to deliver strong cryptographic guarantees (at least by design/promise, implementation is another matter) to the masses.

So when you create an account on any of these services, they create a new unique identity (key pair) for you. You start using that service, you gain followers, and say you get serious. This is the point at which you will bother with all this identity business. Now you have say 100s of followers and you want to move to a different provider, without losing the followers (without breaking the 100s or thousands of packages that depend on you).

At this point you will now claim your identity. You install another app, offline only for strictest guarantees, and generate your real identity, real key pair.

Now if Github Pages etc are acting in good faith, they will let everyone know that they are temporary guardians of your identity, that you have not yet claimed your identity, and that the moment you do so they will cooperate. If they choose to act in bad faith, charge you money to claim your identity, or not let you do it after initially promising that they would, then you could lose followers, but hopefully word will spread and people will shun such services.

So the temporary guardian of your identity can accept your real identity keys, and everyone who is following you via <username>.github.io now learns of your true identity as well (and that Github is an intermediary working for you for now, via you signing the Github-generated keypair with your real identity).

So the real identity has always been the keypair, but we would be using the domain name. We need a distributed store of keypair to DNS mapping, so you can switch the domain for your keypair. Some sort of "key server" is needed. Your followers were initially thinking that the Github-generated keypair was the true identity, but once you update your identity, Github will notify them about the true identity, and your FPM will now store the true identity of what you are following.

We will also have to allow for "key revocation": say one of the guardians went rogue, got bought out, got hacked etc, and starts sending updates on your behalf to your followers. You had signed that key-pair with your key, so now you have to revoke that key-pair. The key-server can do that as well.

So the protocol is: in FPM.ftd you see domains. You keep a cache of domain to keypair id on your file system. Periodically you check with the key-server if the domain for any keypair has changed; if so, fpm commands start reporting warnings that domain entries are out of date, please update ASAP. fpm can now choose to, if you so want, start using the latest entries from the key-server for the identities. So if your dependencies say you are publishing on <username>.github.io, but the key-server says the domain moved to <username>.com, fpm will start fetching from <username>.com.
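The domain-to-keypair cache check described above can be sketched roughly like this. This is a minimal sketch only; the key-server is faked with two dicts, and every name here is hypothetical, not a real fpm API:

```python
# Minimal sketch of the domain -> keypair cache check described above.
# The key-server is faked with two dicts; in reality these would be
# network lookups. All names here are hypothetical, not real fpm APIs.

# What DNS / the provider originally claimed for each domain:
domain_to_key = {"someone.github.io": "key-1"}
# The key-server's current mapping: each keypair has one current domain.
key_to_domain = {"key-1": "someone.com"}  # identity moved to someone.com

def check_dependency(domain, local_cache):
    """Return the domain fpm should actually fetch from, plus a warning
    if the FPM.ftd entry is out of date."""
    keypair = local_cache.setdefault(domain, domain_to_key.get(domain))
    current = key_to_domain.get(keypair, domain)
    if current != domain:
        return current, f"warning: {domain} moved to {current}, update FPM.ftd"
    return current, None

cache = {}  # persisted on the file system in the real design
resolved, warning = check_dependency("someone.github.io", cache)
```

The keypair stays stable across the move; only the alias (domain) changes, which is exactly why DNS can be treated as an alias rather than the source of truth.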

Basic Translation UI Done!

1st Jan 2022

Arpita has been working relentlessly on the translation UI and I am quite proud of what she has done so far.

You can see the translation banner and available languages

Consider this page for example: it is originally in English and is being translated into two other languages, Hindi and Gujarati. From any page in the English version, you can click on the language dropdown to see the other languages it is available in, and jump to the corresponding page in that language.

Now say you go to the corresponding Hindi translation; you see the following:

So this page was translated someday, but the message is saying that it is now not up to date with the original, English version. She has used the excellent Project Fluent by Mozilla to translate the UI messages as well. You can see the Rust code in src/i18n and the translation files in i18n folders. The translation itself needs a lot of improvements; please send a PR if you find some way you can improve the messaging 😅.

If you click on the banner you see more information about what is wrong: why this document is out of date, and how far out of date it is, a slight typo in the original, or a major re-write. Based on this knowledge the reader can decide to stick with the translation or read the original.

We tell you when the Hindi translation was last marked up to date, and what all changes have happened in the English version since that date. This, we believe, is a really unique feature we have brought; we have not seen it anywhere yet, and it demonstrates the strength of our "document tracking technique". We will be using this technique for version tracking (what logical changes have happened in the new version/edition of a book/documentation), change requests, two way sync etc.

We also take you to translation status page if you are so interested:

This page is translated in Hindi as well for example.

The UI needs work, the messaging needs work, but we are so proud of what's working already. Also, you can modify each of those UIs by creating and editing FPM/translation/{missing, upto-date, etc}.ftd; we ship the UI shown above as the fallback, and first look for corresponding files in the package's FPM folder.

FPM Package Info Page
We are accumulating a lot of package meta-data in the generated FPM.ftd file during build time, and now Arpita has started displaying that data on the /FPM/ URL of any fpm-built package. Currently the URL is almost empty, but after we merge this PR it will start looking a bit like this:

It's quite bare bones. Soon it will list information about where to fetch package data from, the LICENSE of the package, and how to collaborate on and edit this package. We will also include the README.md of the project, if it's hosted on Github etc, on this page.

This page is fully customisable. You can modify FPM/info.ftd file and tweak it based on your need.

“FPM Theme”

So now we are building a lot of UI that is going to be part of a website you create. The website you create using FPM/FTD can be exactly how you want it to be; you have absolute control over every pixel of the page. But then we have our UI elements, things that are needed for FPM to work correctly: the FPM/info.ftd file, FPM/translation/*.ftd files and so on.

And you can totally edit those files: copy them from our source (tomorrow we may give you some cli helper for that) and edit them to your heart's desire.

But we do not believe that is ideal. We are going to create a theme, a “document interface” in our terminology, which will contain all the components needed for FPM related UI that gets included in your site.

Once that document interface is finalised, we will create a few themes implementing fpm interface, and hopefully end users can as well. License permitting you can use any of them interchangeably.

Ideally all the main themes should implement the FPM Theme document interface as well, so most themes you find for your blog, resume, book, open source project page, startup landing page etc will come with compatible elements. But if the theme you picked for the rest of your site does not come with the FPM Theme elements, then you can use another theme visually compatible with yours, or create your own.

You have full power over the entire UI, the interaction, the style, and you only need to learn FTD to design those pages. FTD is a lot easier to learn than HTML/CSS/JS for doing this kind of job; FTD is designed even for non-programmers.

Thought: How Should fpm update Work?

So we recently introduced this command, and currently it simply re-downloads all the dependencies, which is fine for our demos and initial usability testing etc, but clearly not something we want eventually.

We are designing the FPM infra to be as "distributed" as possible, which means we can not have a central entity knowing the status of all packages, as is done by many other package managers.

We can poll, re-fetching each <dependency>/FPM.ftd with etag handling etc, but this is still not great. How often to poll? On every fpm command invocation? Every minute/hour/day? Every time the user says fpm update?
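The etag-based polling mentioned above could look roughly like this. The HTTP layer is faked with a function so the sketch is self-contained; in reality `fetch` would be a GET with an `If-None-Match` header, and all names here are illustrative:

```python
# Sketch of etag-based polling of each dependency's FPM.ftd.
# `fetch` stands in for an HTTP GET with If-None-Match; here it is a
# fake returning (status, etag, body). Names are illustrative only.

def poll(dependencies, etag_cache, fetch):
    """Return the list of packages whose FPM.ftd actually changed."""
    changed = []
    for package in dependencies:
        status, etag, body = fetch(package, etag_cache.get(package))
        if status == 304:           # not modified, nothing to do
            continue
        etag_cache[package] = etag  # remember for the next poll
        changed.append(package)
    return changed

def fake_fetch(package, cached_etag):
    server = {"a.com": "v2", "b.com": "v1"}  # current server-side etags
    etag = server[package]
    if etag == cached_etag:
        return 304, etag, None
    return 200, etag, f"-- package: {package}"

cache = {"a.com": "v1", "b.com": "v1"}
changed = poll(["a.com", "b.com"], cache, fake_fetch)
```

With conditional requests, an unchanged dependency costs one cheap 304 round trip instead of a full re-download, but it is still polling, which is the problem discussed next.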

Sure. We can come up with some such scheme. But can we have some sort of middle ground? Not central, yet we do not have to poll everyone? Poll is not great, we want push, as soon as the package we are interested in is updated, say I opened my laptop for a few minutes before catching a long flight, I would want my system to automatically fetch all the latest updates. If in the last minute before I close the lid there was an update to a package, I would want that to be available to me.

Posing the problem like this, no central entity and yet we want push, means we create a few, a few hundred, maybe a thousand "central" agents. Say there is a new component, fpm bastion (probably using that word incorrectly), some sort of fpm aggregator, and say each package can pick say 10 such aggregators. People can maintain fpm aggregator lists; some will be available via fpm.dev, but say it's down, under attack, or sold out, there are other lists, and each package may include a list of aggregators it knows of, so one can look at all the aggregators from all the packages one currently has.

So there are many aggregators. And they themselves are chatterboxes, gossiping with each other. For each package they keep track of the FPM.ftd file. We can always use public key infrastructure to ensure the FPM.ftd file I receive from an aggregator is trustworthy (they can lie about not giving me the latest update, but they can not modify FPM.ftd without getting trivially detected).

Each FPM package should pick a dozen or so aggregators from the list. And every time there is an update, every time fpm sync runs, the content is uploaded to fpm-repo but also sent to aggregators.

Aggregators are chattering and will tell the others when any change has happened. Aggregators may even be paid to give guarantees that they talk to everyone, or that they have strong latency guarantees (the gap between an update happening on any aggregator in the world and this aggregator telling you it has happened).

We will have to employ many techniques from distributed hash table literature and prior work. Maybe even use IPFS etc and not re-invent things.

Or we can go old fashioned, a sort of reverse mail relay: DNS for any domain (or the current version of FPM.ftd, if it's not too old) can tell you the preferred aggregator for this package. We can also do an aggregator push: say my preferred aggregators are this list, then when I am fetching a package from your fpm-repo, I can tell the repo about that list and ask fpm-repo to notify one of my preferred aggregators.

Anyways, so we have a bunch of aggregators, or we have full-blown IPFS infra, whatever. We will now need our fpm to have a daemon that will run in the background and keep fetching stuff. All of this can be done by your fpm-reader.

fpm-reader would be a service for consuming FPM packages. You can install other packages as dependency of a package you are building, but can also “subscribe” to packages via fpm reader, which can be running on your machine (as part of fpm binary maybe), or on your Heroku/DigitalOcean instance if you have devops bandwidth, or you can buy it from a SAAS provider.

If you are using a cloud fpm-reader (meaning not running on your machine), your fpm daemon will connect with fpm-reader and get the updates via a web-socket/ push mechanism.

Update: Turns out nostr describes the protocol very close to what I had in my mind.

Thought: GlueSQL Based Package Query

31st Dec 2021

We have a feature that allows you to query SQLite databases. In general it is a good feature: if you have a SQLite database, sure, go ahead and query it. But we call the query processor package-query because the original intent of that processor was to be able to query a special database that exposes FPM package data. Our first design was a build-package-database step that generates a sqlite file, and then run queries on that sqlite file.
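That original design, build a sqlite database of package data and let documents query it with plain SQL, can be sketched like this. The `documents` schema and the sample rows are invented purely for illustration, not fpm's actual schema:

```python
import sqlite3

# Sketch of the original package-query idea: build a small database of
# package meta-data, then let documents run plain SQL against it.
# The `documents` schema here is invented for illustration.

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents (id TEXT, title TEXT, words INTEGER)"
)
conn.executemany(
    "INSERT INTO documents VALUES (?, ?, ?)",
    [("index", "Home", 120), ("journal", "FPM Journal", 4500)],
)

# The kind of query a document might embed to make itself "smarter":
rows = conn.execute(
    "SELECT id, words FROM documents ORDER BY words DESC"
).fetchall()
```

The appeal is exactly what the next paragraph says: SQL already exists, so there is no new query language or API for authors to learn.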

The main idea was that we do not want to create our own query language or API, that's a lot of work, and a new thing for people to learn, but we do want to expose rich data so people can make their documents smarter.

Turns out there is a project, GlueSQL:

GlueSQL is a SQL database library written in Rust.

It provides a parser (sqlparser-rs), execution layer, and optional storages (sled or memory) packaged into a single library.

Developers can choose to use GlueSQL to build their own SQL database, or as an embedded SQL database using the default storage engine.

This sounds like pretty much what we want, a SQL interface, but powered by our custom logic.

The main reason to use it would not be just that we don't want a sqlite dependency, or that anything is wrong with sqlite, but that the build is circular: you can not create the package data sqlite database from data in documents, because some of that data may be generated from a package query itself.

Not sure what it entails when it comes to actually implementing it, but knowing there is a good starting point, and that we will not have to start from scratch, means this solution is now at least conceptually feasible.

Translation Status Coming Along Nicely

30th Dec 2021

Arpita has been relentless about translation support. After finishing the basic features she is now focussing on data presentation. Here are some of the UIs that are ready.

Translation Status Page
For a package in the original language (which is being translated into other languages), we now create a translation status page. The URL is <domain>/FPM/translation-status/, eg arpita-jaiswal.github.io/vivekanand/FPM/translation-status/, which looks like this (screenshot added to capture the current state, so we can see the progress):

Here we are seeing the status of the original language, which is English. We see that the English FPM package was last modified on 2021-12-29T20:55 and has 13 documents.

We see that it is translated into 2 languages: hi, the 2 letter ISO 639-1 code for Hindi, and gu, Gujarati. For each language we see how many of the original 13 documents are "Partially translated but never marked up to date", "Missing (translation not yet started)", "Translation was done at some point but is now out-dated", and "Up to date". We also see when each of the packages was last modified.

This gives a very good overall picture of the project. We plan to add word counts too, as the progress of library work, which is often what FPM packages contain, is better tracked by word counts, not file counts.

We have to strike a balance between too much data making the UI confusing and too little data leading to people having to click around more than necessary to get the information at a glance. We will have to do usability testing, engage with proper UX people etc. But it's a good start, and we are quite happy with it.

Language Detail Page

So you saw the overall status of translations across all languages, but you would want to dig deep and see more details for any one language. For this we have the Language Detail Page, which has the URL <domain-of-translation>/FPM/translation-status/. I am not too fond of the page being named "Language Detail Page" while the URL contains "translation-status", when we already have another page called "Translation Status Page". But I like the URLs too: it's technically the translation status of that language. What do we call it then, "Translation Status Of Translation Page"? That would be a monstrosity; "Language Detail Page" seems obvious.

Anyways, this is how, say, the Hindi version at arpita-jaiswal.github.io/vivekanand-hi/FPM/translation-status/ looks:

We are showing the total number of documents in the original package (currently it should have shown 13; seems there is a bug). We will also show that this is the translation of "Addresses at The Parliament of Religions", which is in en, and link back to the Translation Status page shown earlier. Maybe we should show the last date on which the English version was modified. Maybe we should show 5 / 13 to track things.

We then list all the documents, currently by their document-id, but soon we will show the document-title and document-title-original also (with a decent, non-cluttered UI, maybe tooltips etc). We have to show the timestamps of when the original was last modified and when the translation was last modified in this table as well.

A Note On Upto-Date-ness Of These Pages

The Translation Status Page is built as part of the original language fpm package, and the Language Detail Page is built as part of each translation. But the translation status page depends on data from all translations, and the language detail page depends on data from the current language and the original language package.

Which means they must be periodically rebuilt, or better, rebuilt in some event-driven way (we know what packages we depend on; what if we also knew what packages depend on us, and could tell each of them "hey, I have changed!"; or maybe we can still have a central repository, or even say orbit-db etc, which only keeps track of which package was last built at what time), so we can auto-refresh the entire page or just the translation-status.html file.

Another approach could be to do client-side fetching of the FPM.ftd files: when you go to the Translation Status Page, we show the last built version, but then download FPM.ftd for each of the translations and recompute the data.

I am not too happy with the frontend approach: every single visitor will re-download everything; think about the global warming, people. FPM is trying to be network resilient. If we create multi-planetary civilisations, the HTTP-based "just fetch what we want when we want it" won't work, and such federated, change-when-needed, show-stale-but-refresh approaches would be needed.

UI Customization

The screenshots you see are the UI that’s shipped as default with fpm, but you can easily customise it by creating FPM/translation/original-status.ftd and FPM/translation/translation-status.ftd files.

We will soon create a “document interface” for FPM Package Info, which will include translation banners (will talk about them soon), translation status page, and upcoming: package info page, package and document history pages, change request pages and so on.

Further, we have to figure out how we internationalise the UI messages.

UI Message Internationalisation

We have two high-level approaches for UI message internationalisation. First is the tried and tested i18n approach, say using Mozilla's Fluent. We define the special strings we have in our source code, and we invest in comprehensive translation so no one needs to improve it (meaning it can not be done by writing FTD files; you must recompile fpm).

I am not fond of that approach, primarily because while we can do that for strings we know about, what about themes and other kinds of content people create? Say a theme for a book (a document interface) will have terms like "table of contents". How do we centrally translate such information so that each user of that document interface in that language doesn't have to keep deciding how to translate things?

We need to think more about this.

Thought: Language IDs
Currently we are using 2 letter ISO 639-1 codes to represent language names, eg in the FPM.ftd file:
-- package: foo.com
language: en
translation: foo.com/hi

2 letter codes are okay, but in FTD/FPM we tend to stick with full names instead of short versions, like integer instead of int, or language instead of lang, since we believe things are written once and read many, many times, and we have an immense capability for getting confused. How many people can guess that gu stands for Gujarati and not some African language? What have we really saved by writing gu instead of "gujarati", which has to be written precisely once in a package that is probably going to have tens of thousands of words?

We have a few problems here. One, we must ensure the language names are exactly the same: we all call it either "Gujarati" or "gujarati", not both. This is because we can then easily re-use translations etc; standardisation is good in general.

We can use the "ISO Language Name" column as shown in the Wikipedia article (or find out where they got the actual data from). They have mapped more than one name to the same code in a few cases, which I feel is actually a good thing: we can be more precise in the language name and honour the preference of the author; we are interested in the code, which we will get either way.

So this is our first decision. We will use proper case also, so language: Gujarati, as it is a proper noun, and not language: gujarati.

Then the other problem is how to show the name of the language on the language detail page in a form native speakers of that language can understand, so we should show "ગુજરાતી". But on the translation status page we should probably show both ગુજરાતી and Gujarati.

Special Languages

So I was thinking of the two names for each language, and how both are relevant information; should we allow both somehow? Say language: Gujarati | ગુજરાતી, where we ignore the part after |, so it can be used as some sort of description of the translation.

This got me thinking whether we must have the requirement that there can be only one translation per language, since if not, this syntax also lets us explain what is happening. So we have the syntax, but do we have a need?

This got me thinking: what if there was a book "Rust Programming Language" and someone translated it into "Rust Programming Language for Advanced Haskell Programmers"? It would remain in English, but the audience has changed; advanced Haskell programmers probably do not need such a laborious, long explanation, maybe their book would be just a 10 page book instead of the 100 (random guess) page current book.

After all, the purpose of translation is to make some work accessible to a population who wanted to read it but could not because of a language barrier. But isn't a book written in too much (or too little) technical detail a language barrier itself? Can you read a book written for an absolute beginner of a topic if you are at an advanced stage, and vice versa? Can these two people really talk to each other meaningfully? Isn't there a need for a translation there?

And consider this also: say I am describing the strict specification of what FPM does. Shouldn't there be a document like this? One which assumes you know everything, and you are only interested in the exact description of the precise behaviour of what FPM does? Of course that document must exist. But is that just a document? It must be a package, right? And then we have the "end user documentation" of FPM. Isn't that just a "translation" of the formal spec?

Conclusion: we will allow the language: Gujarati | ગુજરાતી syntax. The first part must be the language name as per ISO 639-1. But when allowing or disallowing duplicates we will use the entire value of the language: key. So you can not have two translations with language: Gujarati | ગુજરાતી, but you can have one with language: Gujarati | ગુજરાતી and another with language: Gujarati | ગુજરાતી (simplified).
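For example (package names here are hypothetical), the two distinct Gujarati translations above could declare themselves like this in their respective FPM.ftd files:

```ftd
-- package: foo.com/gu
language: Gujarati | ગુજરાતી
```

and, in the second translation:

```ftd
-- package: foo.com/gu-simplified
language: Gujarati | ગુજરાતી (simplified)
```

Both are allowed side by side because the full language: values differ, even though both start with the same ISO language name.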

Thought: Font/Asset Packages

What if there were fonts distributed as FPM packages? They can choose to inline the .woff etc files in the package itself, but they can also link to CDN-hosted font files (this is a bit of a privacy issue: we do not want Google etc to know what page is being viewed, and today browsers re-download all assets, I may be wrong, to avoid timing attacks used for fingerprinting who you are or what websites you visit).

We can do the same to distribute other assets. Say you want to share images with the world: you can upload them to Unsplash, or create an FPM package and put it on Github.

Soon people will start indexing such packages, and you will get a more direct reference, if anyone is using your assets you will be credited in the fpm.dependency of that package.

One question would be: should these assets be copied over, or, given that each FPM package has a URL, should those assets be directly linked? Direct linking may not be good, as people will have to pay for their package getting popular. We should copy over.

Should we have a manifest file that tells what files to copy over? Should we have an asset folder which gets copied over? What if multiple packages end up taking the same name? We should copy the asset under the full package name (or package alias).

How would we refer to such an asset?

-- import: foo

-- ftd.image:
src: foo:path/of/image.png

We can make the URL resolve to /assets/foo/path/of/image.png. We can even do this on demand: the asset package may have many files, but we only copy over the ones referenced in the current package.

What if this is transitive? If x is being built and it depends on a, which needs the asset f1:foo.png, should we create assets/a/foo.png or assets/f1/foo.png? The problem is that the same f1 could be imported by multiple dependencies with different names, so we would end up having lots of copies of the same file.

To solve this, the folder name must match the full package name, not just the alias.
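The resolution rule above can be sketched in a few lines. The reference format and the alias map are assumptions for illustration; only the idea, that the on-disk folder uses the full package name so two aliases for one package share one copy, is from the text:

```python
# Sketch of resolving `foo:path/of/image.png` style asset references.
# To avoid duplicate copies across dependencies, the on-disk folder uses
# the full package name, never the per-package alias. Names hypothetical.

def resolve_asset(reference, aliases):
    """aliases maps an import alias to the full package name."""
    alias, path = reference.split(":", 1)
    package = aliases[alias]  # e.g. "foo" -> "fifthtry.github.io/foo"
    return f"/assets/{package}/{path}"

# Two dependencies importing the same package under different aliases:
aliases = {"foo": "fifthtry.github.io/foo", "f1": "fifthtry.github.io/foo"}
a = resolve_asset("foo:path/of/image.png", aliases)
b = resolve_asset("f1:path/of/image.png", aliases)
```

Both references land on the same `/assets/fifthtry.github.io/foo/...` path, so the transitive-dependency duplication problem goes away.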

Thought: Testing Improvement (dynamic documents)

29th Dec 2021

Was reviewing the latest PR by Arpita, and this thought occurred to me: we currently do what Jest popularised as "snapshot testing", we create "snapshots" of generated HTML files and ensure they do not change.

I find HTML a bit noisy; reviewing changes in generated HTML is not fun. Further there is a question of coverage: not all features of fpm may have been used in the .ftd file, maybe it is bare-bones, and when someone uses realistic .ftd files, maybe things would not go as expected.

fpm.ftd snapshotting

We are currently creating one main dynamic file, fpm.ftd. It's a virtual file: it does not exist on disk, is generated at run time for each document getting rendered, and contains data related to that document.

We should snapshot the generated fpm.ftd file on the file system as well, and ensure it doesn't change without review, so all the variables are asserted upon.
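The snapshot idea itself is simple enough to sketch: record the generated content once, and flag any later run where it differs. The helper name and file layout are invented, not fpm's actual test harness:

```python
# Sketch of snapshotting the virtual fpm.ftd: record the generated text
# on disk once, and fail review when it later drifts. The layout and
# helper names are illustrative, not fpm's actual test harness.
import tempfile
from pathlib import Path

def check_snapshot(name, generated, snapshot_dir):
    """Record the snapshot on first run; afterwards, return whether the
    freshly generated content still matches the recorded one."""
    snap = Path(snapshot_dir) / f"{name}.snapshot"
    if not snap.exists():
        snap.write_text(generated)
        return True
    return snap.read_text() == generated

d = tempfile.mkdtemp()
first = check_snapshot("fpm.ftd", "-- document-id: index", d)  # records
same = check_snapshot("fpm.ftd", "-- document-id: index", d)   # matches
drift = check_snapshot("fpm.ftd", "-- document-id: other", d)  # mismatch
```

Diffs against the .ftd snapshot are smaller and far more readable than diffs against generated HTML, which is the whole point of the section above.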

Special Files

There are some special files, eg translation-stats. It's provided by us, and it can't really have any bug per se; I mean we can ship some broken FPM/translation/status.ftd file, sure, but it's just a text file, and I do not want to see the generated .html file for it. Similarly we have the FPM/markdown.ftd file, which, again, would do what it would do, and there is not much point storing the .html of it.

Not sure. The thought that's coming to me is that in such cases we need not necessarily store the .html file; we just store the corresponding dynamic context, fpm.ftd or whatever, and snapshotting the .ftd is enough. We will have more readable and smaller diffs. I guess I am just trying to reduce the amount of code review I have to do 😅.

fpm update Implemented
A few days back, as part of some other larger PR, Arpita implemented fpm update. It's currently pretty "hacky": it deletes the .packages folder and re-downloads all the dependencies.
FPM/prelude.ftd Added

Shobhit just implemented support for FPM/prelude.ftd: this file, if found, is prepended to every .ftd file being processed (except the FPM.ftd file).

Usually you would have a config.ftd file that stores a bunch of things you want available everywhere. Also, you would want to include one or more component libraries in your ftd files. Putting them all in FPM/prelude.ftd saves you from repeating the same lines in every document, making documents that bit cleaner.
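For instance, a FPM/prelude.ftd could pull in the config module and a theme so every document gets them automatically. The module and package names below are hypothetical, shaped after the import examples elsewhere in this journal:

```ftd
-- import: config

-- import: fifthtry.github.io/blog-theme as theme
```

With this prelude in place, every .ftd document in the package can use config.* variables and theme.* components without repeating the imports.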

Thought: Publishing FPM.ftd With Extra Data, and Better /FPM/ Page

Recently we discussed that we will start publishing (copying) FPM.ftd into the .build folder during build, and how we can sometimes download just the FPM.ftd of an FPM package instead of downloading the entire package .zip file. It sounded like a great idea, and Arpita implemented it right away. She did one better: she included some extra data in the FPM.ftd file as well.

This opens up interesting possibilities: now we can include all sorts of generated package meta data and make it part of the published FPM.ftd file. We can show the number of files or the word count of the package; we can include the LICENSE, even the sitemap, the latest snapshot versions etc. We can anticipate a lot of needs, and most of the "metadata" needs of fpm can be satisfied by putting data in FPM.ftd.

So currently we also publish /FPM/index.html, which is generated from the FPM.ftd file in the package. With this idea of making a lot more data available to FPM.ftd, that becomes even more interesting.

We will create a FPM/info.ftd file, which will have all the FPM.ftd data available and will be rendered as FPM/index.html, so we can have a proper package meta data page as well.

Source Of Truth: Repo Or Deployed?

For any package, we have two potential sources of truth: the repository where you can get the package source from, where it is presumably being maintained, say Github, or the actual built and deployed URL where things are hosted. Both contain the FPM.ftd file.

Consider translation status. Say the original package is trying to create a status dashboard for all translation packages, showing how many documents in each translation are up to date, missing etc. To compute this information fpm can download the latest version of each translation package, but where is this information coming from? The source repo, like Github: the zip may point to the main branch's .zip file.

If we trusted the main branch, the problem is that end users viewing the translation status page may see "oh, this package is 100% translated in so and so language", but when they click on the link they may find that it's not so, as there is a possibility that things have landed in the main branch but have not yet been built and published. This can happen because, say, a deploy failed for some reason.

So it seems that <package-name>/FPM.ftd should be considered the source of truth when it's materially different from <package-repo>/FPM.ftd.

Domain In Package Support Done

28th Dec 2021

Seems it takes less time for Arpita to build a feature than for me to write what should be built! Just like that she implemented packages with Golang-like identifiers.

Consider this FPM.ftd file for instance:

-- import: fpm

-- fpm.package: arpita-jaiswal.github.io/vivekanand
language: en
zip: github.com/arpita-jaiswal/vivekanand/archive/refs/heads/main.zip
translation: arpita-jaiswal.github.io/vivekanand-hi
translation: arpita-jaiswal.github.io/vivekanand-gu

Without searching, you can now just copy-paste the package id into the browser and see the content of the package. So much better than the earlier approach of searching for the package in the package repo and hoping the repo index page links to the published version somewhere.

Arpita pointed out that one of the cool side effects of this design (the fact that FPM.ftd is now downloadable without having to download the entire FPM package) is that in many cases we will not have to download the full package. For example, when an FPM package is specifying its translations, we only specify the package name; the actual language of each translation is part of that package's FPM.ftd meta-data, and she used to have to download the entire FPM package zip file just to get the language (so she could show it in the translation dropdown).

Idea: Canonical URL Support

So we are going to approach a few open source projects soon to check out FPM and our translation features. For us to set up the demo properly and get the discussion rolling, it would be good to import the projects we are targeting, convert them to FPM packages and publish them on some temporary URL for demo purposes.

But we do not want Google to index our pages. We can use robots.txt and block indexing, but it's even better to add a canonical URL so Google gives proper credit to the original URL, improving its search ranking.

So we will support canonical-url key in fpm.package:

-- package: www.amitu.com/rust-book/
canonical-url: https://doc.rust-lang.org/book/
Currently we are supporting it only at the package level. This means every URL that we generate, eg /foo/, which is generated from foo.ftd, will have:

<link rel="canonical" href="https://doc.rust-lang.org/book/foo/" />

in the generated file, with href pointing to the corresponding URL. Note the /foo/ appended to canonical-url.
fpm.canonical-url
In the future we will also allow canonical-url on the special module fpm, so any document can -- import: fpm and overwrite fpm.canonical-url: <some url> to specify an arbitrary canonical URL on a per-document basis if it wants to.
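The per-document href above is just the package canonical-url joined with the generated document URL. A minimal sketch of that derivation (the function name is hypothetical; `urljoin` does the slash handling):

```python
# Sketch of deriving the per-document canonical URL from the package
# level canonical-url, as described above; urljoin handles the slashes.
from urllib.parse import urljoin

def canonical_tag(package_canonical_url, document_url):
    # document_url is the generated URL, eg "/foo/" for foo.ftd;
    # strip the leading "/" so it resolves relative to the base.
    href = urljoin(package_canonical_url, document_url.lstrip("/"))
    return f'<link rel="canonical" href="{href}" />'

tag = canonical_tag("https://doc.rust-lang.org/book/", "/foo/")
```

This reproduces the example tag from the section above, with the /foo/ path appended to the canonical-url base.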
Document Title Support Implemented

Shobhit implemented document title support yesterday. Till now, all HTML files generated by FPM used to have the title "ftd"; after this pull request the title is picked from the content of the FTD document being processed.

In FTD we have a concept called region: any FTD element can use region to express its semantic meaning. The idea is to let screen readers, search engines etc discover what the heading is, or what role is played by different elements, so they can skip over things and focus on the main content etc.

We have regions like h0, h1 and so on for headings of different levels. To compute the title we look for the first text block with the h0 region; if it's not found we look at the first h1, and so on. If no region is set then we look for the first text element. If that is also missing then we assume the document-id (the relative file path of the file being processed, with respect to the package root, without the .ftd extension) is the title.
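The selection order above can be sketched roughly like this (a Python illustration; the (region, text) pair representation is an assumption for the sketch, not FTD's actual document model):

```python
def pick_title(blocks, document_id):
    """blocks: list of (region, text) pairs in document order;
    region is None for plain text elements."""
    # Look for the first h0, then the first h1, and so on.
    for region in ("h0", "h1", "h2", "h3", "h4", "h5"):
        for (r, text) in blocks:
            if r == region:
                return text
    # No heading region anywhere: fall back to the first text element.
    for (r, text) in blocks:
        if r is None and text:
            return text
    # Nothing at all: use the document-id without the .ftd extension.
    if document_id.endswith(".ftd"):
        return document_id[: -len(".ftd")]
    return document_id
```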

Dependency Alias Support Implemented

[Shobhit’s PR for dependency aliasing][pr] was merged today. This brings us one step closer to "changing a theme should be just a one-line change". (It's a bit harder for FPM than for systems where theming is narrowly defined: there you always know what you are theming. WordPress supports WordPress themes, which only work for blogs; Winamp supports Winamp "skins", which only work for Winamp; you can't use a Winamp skin with your WordPress blog, for example[1]. With FPM, all sorts of UIs, from a music player to a browser to a pricing page to a resume, can be "themed", and we want cross-portability across all of them "as long as it makes sense": as long as they follow the same "document interface".)

This is how you create an alias:

-- import: fpm

-- fpm.package: fifthtry.github.io/amitu
zip: github.com/fifthtry/amitu/archive/refs/heads/main.zip

-- fpm.dependency: fifthtry.github.io/blog-theme as theme

We have used "as theme" to create the alias theme for the package fifthtry.github.io/blog-theme.

In order to use it you can do this:

-- import: theme

-- theme.h4: Hello world!

We are importing fifthtry.github.io/blog-theme by importing theme.

Note: we can also import using the full package name if we want to.
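The alias lookup itself is conceptually just a table lookup: when a document says -- import: theme, the importer consults the alias table built from the -- fpm.dependency: <package> as <alias> lines. A hypothetical sketch (names are illustrative, not fpm's implementation):

```python
def resolve_import(name, aliases):
    """Map an import name to a full package name via the alias table.

    Anything not in the table is treated as already being a full
    package (or current-package document) name.
    """
    return aliases.get(name, name)

# Alias table as it would be built from:
#   -- fpm.dependency: fifthtry.github.io/blog-theme as theme
aliases = {"theme": "fifthtry.github.io/blog-theme"}
```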

[1]: In case you are wondering why anyone in their sane mind would want to include a music player theme in their blog: say you are discussing the UI of the player, and you want to show how the UI works. What would you do? Take a screenshot? And you call that the action of a sane mind? What will happen when that theme changes? Are you going to come back and update the screenshot? That would be lunacy, my friend. Just import the FPM package for the Winamp theme, pass it the data that Winamp would have passed it (if we were living in a world where people are not, you know, insane – recreating UI on platform after platform, even more insane when the UI of one "framework" on the same platform can't use the UI of another framework, I am looking at you JavaScript, with your framework parade – and were using FPM, for example), and now your blog post contains an always-fresh, true, interactive Winamp screen. FPM makes it possible. Why would you not?

Translation UI Working!

27th Dec 2021

Arpita just finished the translation UI (and the basic translation feature in general). I started writing an update about the translation feature on the Journal, but I feel this deserves a video update. Let's polish it a bit more, create the different states (currently all translations are missing), do a bit of UI polish, and towards the end of the day create the video.

In the meanwhile you can check it out here. Notice the links in the header, they take you to respective pages in other languages.

Super awesome job 🙌, Arpita!

Package Name With Domain?
Currently, when we add a dependency on another package, we need two bits of information: the name of the package and the repository you can get the package from.
Decentralisation

We need to know where to download a package from because of our decentralised design. We do not want a single central infrastructure in FPM like other languages have, eg crates.io for Rust or npmjs.com for JavaScript and so on. Centralisation has many problems: censorship becomes easy; there is a single point of attack for bad players; confidentiality is difficult (what if you have some information you do not want to leave your company VPN?); and bureaucracy (you can not get a new feature on crates.io till you go through whatever hoops are there to work with their team: I am sure they are very responsive, but they can't be as responsive as your own self-hosted repo, which you can tweak and deploy hourly if you wanted to).

But for FPM, centralisation is especially a problem: other package managers only distribute packages to be downloaded once at build time. When you are doing a deploy, you download the .zip file from the package repository, and won't download it again till another dependency is added or a version changes. No end user goes to crates.io/npmjs.com etc.

FPM packages, on the other hand, are actual websites. So an FPM repository is used by all end users, and all your site traffic goes to your fpm-repo. No single website or company can host the entire Internet (this is live traffic, not archival like Archive.org).

We also envisage that fpm-repos will offer differentiated features: some may give load-balanced deployment with support for edge caching, others may focus on Math-related computation etc. FPM and FTD have a lot of dynamic features, and that means there is a lot of play in features offered, pricing, business models and so on.

Golang Like Package Naming
Golang recommends that package names contain domain names, with special emphasis on vanity URLs (not just URLs pointing to code hosting locations). Their entire tooling works with domain names:
godoc cloud.google.com/go/datastore

The cool thing is you can open the package name in a browser and see the documentation of that package. This is a far better experience than Rust etc, where if you see a module being imported in a file you have to do a search to get to the documentation of that module.

So can we have:

-- fpm.dependency: amitu.com/realm

Instead of:

-- fpm.dependency: amitu/realm
repo: github
fpm build to copy FPM.ftd

For this to work we will need the metadata about the package we want to download. The URL may point to a static site hosting URL, and that URL does not give us the entire package source.

We can choose to upload the latest package along with the static build, but that forces us down the .zip path, whereas if we know the repo is a git repo we can do more optimal downloads (eg git pull instead of a full download). Git, or the upcoming fpm-repo, also gives us more features like branches, getting things from another folder etc, so uploading a .zip is not optimal.

We must copy just the FPM.ftd file instead.

The New Package Download Scheme

Currently we assume each package is hosted on Github, construct the .zip URL when we have to download a package, and download it into the .packages folder.

What we want to do now: given the package name, append /FPM.ftd to it, download the FPM.ftd file, parse it, and get the zip attribute of fpm.package.

Earlier we used the repo key, a single key that contained different kinds of values: it could have been a zip URL, Github etc, or the URL of the fpm-repo.

Now we will have dedicated keys: if the package is to be officially distributed as a zip file, set the zip key; if it's on Github, set the git URL (a URL that git clone understands); and we can support hg/svn/fossil etc as well.

The package name can not contain https://, for cleanliness. We will do what modern browsers do: first check if https:// works, else try http://.
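The lookup scheme above can be sketched like so (an assumed flow, not fpm's actual code; parse_zip_key is a deliberately naive parser, for illustration only):

```python
def candidate_urls(package_name):
    """URLs to try, in order, for a package's FPM.ftd:
    https first, falling back to http, like browsers do."""
    path = package_name.rstrip("/") + "/FPM.ftd"
    return ["https://" + path, "http://" + path]

def parse_zip_key(fpm_ftd_text):
    """Naive scan of the downloaded FPM.ftd for the zip key."""
    for line in fpm_ftd_text.splitlines():
        line = line.strip()
        if line.startswith("zip:"):
            return line[len("zip:"):].strip()
    return None
```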

fpm build --base=/foo/ for Github Pages

26th Dec 2021

There is a difference between using fpm build locally and using it for a GitHub Pages hosted site. Locally we build everything in the .build folder directly, and the URL is /x/ for an x.ftd file. But when hosting a site on GitHub Pages without a custom domain, the URL GitHub Pages gives us is <account-name>.github.io/<repo-name>/x/. So every URL is prefixed by /<repo-name>/.

One idea that came to me was to use HTML's base tag: set the base to either / locally or /repo-name/ in the GitHub Action for GitHub Pages, and use relative URLs in ftd files. I have implemented this feature.

Another idea, which came to me after implementing the first solution: we create a folder <repo-name> in the .build folder, so when browsing locally we get the same path as when browsing remotely, and all URLs that we put in ftd files have the /<repo-name>/ prefix.

I am not sure about the pros and cons of the two. The former is implemented already, so that's one pro. The latter somehow feels cleaner to me. Not yet sure.

If you are using GitHub Pages, build your files with fpm build --base=/ locally and fpm build --base=/repo-name/ in your GitHub Actions, and use x/ instead of /x/ when creating links in your ftd files.

fpm CLI now Supports Windows

fpm already supported macOS (we develop it on macOS) and Linux (our CI runs our test suite on Linux), but we never tried it on Windows so far.

Ganesh has been trying to use FPM on Windows, and he had to install Visual Studio and Rust to build it. He had a really old Windows 7, 32-bit laptop, and it turns out Visual Studio itself doesn't support the Windows 7/32-bit combination. He upgraded his laptop's hard disk to an SSD, updated to Windows 11, and got Rust to compile FPM. We came across an SQLite-related issue, some SQLite DLL was required, so I switched to bundled SQLite, and it finally compiled.

But fpm build did not work! It did not find any files. It turns out our code had hard-coded / in a bunch of places; Shobhit found and fixed some of those issues, and after this fpm build is working for Ganesh.

Shobhit basically followed the Microsoft documentation about Rust setup and it just worked. The cool thing is the compiled exe works even on machines without Rust (of course) and without Visual Studio (I was not sure about this) installed.

We have not yet created any automation for creating the exe every time a new FPM release is created; we will figure this out soon (Jan-end timeframe). For now, if you need the latest FPM exe, shout out to Shobhit on our Discord channel and he will build it for you.

We have not actually run our test suite on Windows yet, only verified basic features manually. We use fbt for testing, a testing framework for CLI applications that we have developed at FifthTry, but it too doesn't support Windows yet.

Thought: Unicode in FPM/FTD

Was looking at this thread on Reddit: Filename with dots or Unicode chars not working and I kind of agree with this sentiment:

I find it little disconcerting that a “modern” language has such an arcane restriction about Unicode filenames.

Reviewing the non-ascii-idents RFC that Steve Klabnik linked to, I kind of agree with only one of the drawbacks listed there:

Homoglyph attacks are possible.

Overall I believe we should go ahead with it. Rust has decided to, despite the issues; so should we.

What would it entail? What would we have to do? We should support Unicode in FPM package names, as well as identifiers (component, variable, record names etc) in FTD. Is any character allowed in names? No: ID_Start and ID_Continue character classes are defined in Unicode, so all our "id"s should allow them (minus keywords reserved by FTD itself, eg record).
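For a quick feel of what this would mean in practice: Python's own identifier rules are based on the closely related XID_Start/XID_Continue Unicode classes, so str.isidentifier() approximates the proposed rule (the keyword set below is an assumed illustrative subset, not FTD's actual reserved list):

```python
# Assumed subset of FTD-reserved keywords, for illustration only.
FTD_KEYWORDS = {"record", "import", "component"}

def is_valid_ftd_id(name):
    """Approximate the proposed rule: Unicode identifier characters
    (XID_Start followed by XID_Continue), minus reserved keywords."""
    return name.isidentifier() and name not in FTD_KEYWORDS
```

With this, a Devanagari name like चर is accepted, while record (reserved) and 1x (digit start) are rejected.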

Thoughts: Package Naming Restrictions & $FPM_HOME

Other than the Unicode stuff in package names, we actually have some restrictions on package naming, so I wanted to write down my thoughts about them. Currently all packages in FPM are distributed via Github. In the near future, ~Jan-Feb 2022 timeframe, we should also have fpm-repo, which can be used for publishing packages.

fpm-repo is going to be self-hostable, and there will be a SaaS version as well. But the distinction, whether something is self-hosted or on some package repository, should not be exposed to users of the package, so authors can easily move their package from self-hosting to SaaS or back without breaking anything.

Moving from Github to fpm-repo should ideally also be seamless, but currently we expect the package to specify repo: github. We also have a domain setting for each package, so the domain name could be considered part of the package name. If we did that, the published version of a package could include information about how to fetch its source version, so in a dependency it would be enough to give just the package name (with the domain name as part of it).

I was going in another direction, but I think I like this decision even more.

Font Support Added (and thoughts on future renderers)

24th Dec 2021

After fixing the font issue in FTD, we finally got FPM working with custom Google Fonts (any font, for that matter, should work).

We had an easy way to fix it and a "correct" way. The problem is that a font is not a simple file. Since there are a lot of languages, we can not embed symbols for all characters in all languages in a single font file: that would be a large file and would slow down websites (and waste CPU/battery/bandwidth). Similarly, many fonts design the regular face, italics, bold etc separately; previously people used to just algorithmically slant the characters, or make the font bold by thickening the lines, but for best effect font designers create custom designs for some or all characters in each font style. So we can not put regular, bold, italics etc in one file either, as that too would be wasteful.

So CSS lets you specify multiple font files, for different unicode ranges, font faces etc, and lets the browser download only the fonts it needs based on the content of the current page.

So coming back to the problem: we only had a place to store one font file, which clearly won't cut it. The easy fix was suggested by Sourabh: allow people to pass the URL of a CSS file, eg Google Fonts recommends something like this:

<style>
@import url('https://fonts.googleapis.com/css2?family=Roboto&display=swap');
</style>

@import, or the <link href="<some-url>" rel="stylesheet"> equivalent, can be used. So we can ask people to give us a CSS file URL as the font file, and we can include it in the page. This does solve the problem, and it's easy. But it has a flaw: it allows people to provide an arbitrary CSS file. There is no way to check the content of the target file to ensure it only contains @font-face declarations; it can contain any CSS, and so it becomes a way for people to include CSS in FTD files.

This is a problem because FTD is designed with multiple-target support. We currently compile FTD to HTML/CSS/JS, but in future we will compile to Swift or Kotlin code for mobile devices, or C code for ncurses-based terminal UI support, and so on. We take inspiration from CSS, the attributes for example, but that's it: those attributes will have equivalents in Android/ncurses etc, or we will try to degrade gracefully.

If we allow CSS then suddenly our layout kernels can be misused to give arbitrary layouts, and suddenly FTD files won't be compatible with our future FTD rendering targets. We believe CSS is over-designed, and so is JS. We could create a browser from scratch, without the CSS/JS overhead, that is capable of rendering FTD files; so even for browsers we do not need the whole thing. We could maybe even use the canvas API to do such custom rendering in current HTML-based browsers.

So we can not include CSS. What we did instead is look at @font-face from CSS and create an equivalent for it:

-- fpm.font: Roboto
woff2: https://fonts.gstatic.com/s/roboto/v29-normal.woff2
style: normal

-- fpm.font: Roboto
woff2: https://fonts.gstatic.com/s/roboto/v29-italic.woff2
style: italic

This is quite equivalent to what those font CSS files include:
@font-face {
  font-family: 'Roboto';
  font-style: italic;
  font-weight: 100;
  font-display: swap;
  src: url(https://fonts.gstatic.com/s/roboto/v29/font.woff2) format('woff2');
  unicode-range: U+0460-052F, U+1C80-1C88, U+20B4;
}
@font-face {
  font-family: 'Roboto';
  font-style: italic;
  font-weight: 100;
  font-display: swap;
  src: url(https://fonts.gstatic.com/s/roboto/v29/font-2.woff2) format('woff2');
  unicode-range: U+0400-045F, U+0490-0491, U+04B0-04B1, U+2116;
}

We support all the attributes on fpm.font, with simplified names, eg name instead of font-family, or style instead of font-style etc, and we removed src in favour of keys for the individual font types supported, eg woff or opentype etc.

So now we generate the CSS instead of getting CSS from the user; this way we know how to translate it transparently to Android when we come around to supporting native Android rendering, and so on.
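The generation step can be sketched as follows (field names follow the fpm.font example above; the exact CSS fpm emits is assumed for illustration, not verbatim output):

```python
def font_face_css(name, woff2, style="normal", weight=400, unicode_range=None):
    """Render one fpm.font record as a CSS @font-face rule."""
    lines = [
        "@font-face {",
        f"  font-family: '{name}';",
        f"  font-style: {style};",
        f"  font-weight: {weight};",
        "  font-display: swap;",
        f"  src: url({woff2}) format('woff2');",
    ]
    if unicode_range:
        lines.append(f"  unicode-range: {unicode_range};")
    lines.append("}")
    return "\n".join(lines)
```

Each -- fpm.font record would be rendered this way and the rules concatenated into the generated stylesheet.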

In future we may create a utility, fpm css2font, that will take the CSS URL from Google Fonts etc and auto-convert it to fpm.font specs that can be added to the FPM.ftd file.

Doc/TOC Subtree Re-levelling Problem

23rd Dec 2021

So I said including content from another package is proving to be challenging. Let's see the problem. We have two modes of including content from other packages.

Doc Sub-Tree Includes

In this mode we have a document, and in that document I want to include some content from another document. If I want to include just an image, a table or a code snippet from that document, life is easy. But what if I want to include an entire heading subtree?

Say that document has been organised using headings h1, h2, h3 etc, and say I want to include the entire heading subtree rooted at the second h2 or third h1. To not complicate life, we can give that specific h2 an id.

So we want to include that entire h2 in the current document, including that h2's descendants like h3 etc, and all the code, images and tables of that h2. We can write something like this:

-- include: the-id-of/the-document
id: the-id-of-the-h2

Now let's assume that the document where we are including that h2 does not have that much complexity, and it wants to include that h2 at h1 level.

This is the “re-levelling” problem. We may want to write something like this:

-- include: the-id-of/the-document
id: the-id-of-the-h2
level: h1
Cross Package Component Visual Harmonisation

The first challenge: if the content is from a different package, the two packages may have different ideas on how to show a code snippet or a table etc.

Package interfaces and package aliasing are features we are working towards to solve this problem. If you include the package as a dependency and override its transitive dependencies to use a package you provide, instead of the package the original author requested (as long as the substituted package conforms to the package interface of the package being substituted), we achieve component visual harmonisation.

Re-Levelling Challenge

In visual harmonisation we had to map foo.code to bar.code: the name of the component stays the same, only the package it is imported from is substituted. This is easy to implement in the ftd::Library::get() method.

But here we have to replace each instance of foo.h2 with foo.h1, foo.h3 with foo.h2 and so on, while foo.code is not going to change. So how do we find the promotion/demotion rules for any component in any package?

We can try to solve it by creating promotion rule entry in FPM.ftd:

-- package: foo

-- promotion-rules:
heading-2: heading-1
heading-3: heading-2
heading-4: heading-3

With these rules in place we are saying: 1. only the components listed in promotion-rules are to be promoted or demoted, and 2. heading-2 can be promoted to heading-1 and demoted to heading-3, and so on.
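Applied mechanically, promotion/demotion is just an index shift within the listed rule chain; a hypothetical sketch (the rule list mirrors the promotion-rules example above, the shift logic is an assumed interpretation):

```python
# The ordered chain implied by the promotion-rules example above.
RULES = ["heading-1", "heading-2", "heading-3", "heading-4"]

def relevel(component, shift):
    """Promote (negative shift) or demote (positive shift) a component.

    Components not listed in the rules, eg a code component, are left
    unchanged, matching point 1 above.
    """
    if component not in RULES:
        return component
    i = RULES.index(component) + shift
    if not 0 <= i < len(RULES):
        raise ValueError(f"cannot shift {component} by {shift}")
    return RULES[i]
```

So including an h2 subtree at h1 level would apply relevel(c, -1) to every heading component in the subtree.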
TOC Sub-Tree Includes

This is the scenario where a package has a table of contents, and it wants to include multiple documents from another package in that table of contents. The challenge in this case is "re-levelling" document "roles" like "chapter" or "section" and so on. In the original package the subtree we want to include may be just a section, but in the current package we may want to import it at chapter level.

If documents know about their role, eg the title of a document is "Section 1: Introduction", it will read wrong when included as a chapter. Also, chapter numbering must not be part of the document.

One thought is to define variables like fpm.role and fpm.role-number, and ask documents to use these variables instead of making assumptions about their roles.

Challenges In “Remixing” Text Content

22nd Dec 2021

Since Monday, Shobhit and I have been spending a lot of time discussing how best to allow people to include content from one package in another.

Things are easier with data. Say you have a package that has defined some list, variable or record: you can easily import that package, iterate through that list or use the variable, and create your custom visualisations etc. This part of FTD/FPM is already working well.

But when it comes to text content, it is proving to be surprisingly hard.

Remixing Books: Documents from Dependency Packages

21st Dec 2021

When we are writing software we do not start from scratch; we take existing libraries and build on top of them. When writing a book, on the other hand, we somehow live in a world where the assumption is that every last word must be written by the author, or, well, it's not a book.

It is not like we do not have open licensing available today for prose and creative writing. Creative Commons is out there. Wikipedia is a successful project using Creative Commons. Github today has 147,082 Creative Commons repositories.

Even software documentation, like the Django documentation, is available under a permissive license, even though it is not Creative Commons. A significant number of open source projects ship with documentation, and usually that documentation is under a permissive license.

So we have a huge amount of content available today. And yet we almost never see the content being "remixed", work being built on top of existing stuff. Everyone writing a new book on, say, Python feels they must start from scratch.

This, I believe, is bad for the world. We may think we have a lot of books, but we do not have enough books written. I once tried to write "Python for PHP Developers" and got quite a good response on Reddit etc. We do not have enough intermediate-level books.

But much worse than too few books, the real problem is, ironically, still too many books. Say you are an intern and you want to join FifthTry: you will have to learn Rust, but then also Diesel, and PostgreSQL. And yes, Django. And possibly even Elm. Some JavaScript. Some CSS. And of course enough about Linux, and AWS? And this is assuming they do not have to learn any FifthTry-specific stuff. Which they still have to. What is FTD? How we deploy our services, how we do things and so forth.

Imagine there was a great book written about each of these topics: am I supposed to hand that many books to an intern who joins us? Maybe not all of them, but it is still many books.

How about if there was just one book? Think about it. Do we really need to learn the whole of Django? Of course not; we only use Django for database modelling and migrations, so they need not actually read about Django views, admin, templates and so on. If you think about it, telling an intern to learn any of these in full is itself really hard. How much of Rust should they know? How do you even talk about it? Every team that uses Rust uses a different aspect of Rust.

Should you learn things that are not directly pertinent to your job, the exact task? Of course learning is never bad, but we have only so much time, and if you are not smart about it you may end up spending too much of your precious learning time on things that will not be used on the job, at the expense of other stuff that you will need. Knowledge acquisition simply can not be done in a depth-first fashion, since it's a bottomless pit.

In summary, we want people to be able to create books, dedicated learning websites, or compilations of information, knowledge, data etc, by using other packages as dependencies. This can allow "knowledge curators" to create learning paths for specific audiences.

Translation UI Design & Multi FTD Technique

20th Dec 2021

Arpita and I worked further on the design of the translation UI. Check out this commit to see the basic skeleton of our design.

The key idea is that we will now have multiple base HTML files, with-fallback.html and with-message.html, which will be created from the content of more than one FTD document. For example, with-message.html will render a "message document" and a "main document". Further, individual documents can be shown and hidden using basic DOM style manipulation (display: none etc).

We are going to need a new feature in FTD, "message-host", which will allow an event handler to message the "ftd host" (also, in future we are going to rename FTD's "Library" terminology to "FTD Host").

Our with-fallback.html will come with show_main() and show_fallback() host messages that our "message ftd" file will be able to trigger using $on-click$: message-host show-main etc.

We are quite happy with this design for now. I wrote a couple of feature requests that can be implemented using this: "banner support" and encryption support.

Can Translation Package Use Tracking?

19th Dec 2021

A relatively strong consideration when designing translation, especially translation of versioned packages, is the "translation can not alter logical content" rule. Consider 1.0/intro.txt vs 1.1/intro.txt, and a question: who decides if 1.1/intro.txt must exist? The original package only, or can the translation package also decide this? Say the original has only 1.0/intro.txt and no 1.1/intro.txt. That means that as per the maintainers of the original package, who supposedly work closest with the actual software being translated, there is no need for 1.1/intro.txt: no software change between 1.0 and 1.1 is such that intro.txt's content changes. But can translation maintainers overwrite this call and still decide to have 1.1/intro.txt?

What if the original 1.0/intro.txt had only so much content, and the translation team not only translated it but decided to expand the intro and cover more topics? If they do, the original and translated intros have logically diverged. It would then not be accurate to say that this is just a translation; it is an independent work derived from the original.

I am not saying logical changes, derivative works, shouldn't be allowed, just that one should be aware when one is doing it. Some authors may be okay with their work being translated "faithfully" into other languages, but may not like their work being editorialised or "improved upon" by others (and sold / marketed in the original author's name). I as an author may not be able to read how you have altered my work in Chinese or some other language, so how can I endorse it? I don't even know what's in it. Maybe the change is minor, but it's really hard to say what is minor and what is not. A little thing the translator decided to omit because it felt too early to mention, may be the very reason people do not buy the product: the author mentioned that feature early because the author knows the audience better than the translator does.

Bottom line: there should be a clean line between translation and derivative work. And authors should be allowed to say they are okay with strict translation but do not have the energy to vet "editorialisations" and do not want derivative works.

So when we are talking about translation here, we mean the stricter translation: no logical changes, as faithful as the vagaries of natural languages allow one to be.

Let’s talk about tracking now. Tracking basically says “hey this document has changed, and you said that document depends on this document, so maybe consider updating that document”.

Now one can argue that tracking should not be allowed in translation packages. If you want to be strict, you may be tempted to say the only document a translated document should be allowed to track is the corresponding document in the original package.

By and large I agree with this assessment. There is one exception though: what if, say, you are also maintaining a "translated glossary"? This document may not exist in the original, but every translation must maintain a document that lists the translated term we are using for each interesting technical term in the original language. If we start from scratch, two different translators can pick two different ways to translate a term, and that is not good for readers. So this document should be present, and one can argue every document in the translated package should be tracking the glossary file.

We are planning a feature in ftd which can make this consideration redundant though: do not put translated terms in the respective documents at all, but refer to them from a common glossary file via ftd markup references, so that if the translation of a term changes, all documents using it automatically get the latest version.

Where does it leave us? Disallow tracking in translation packages?

Translation Way Forward
17th Dec 2021

So Arpita has heroically built almost the entire translation feature. fpm translation-status is working: it looks at the status of all documents in the original package by reviewing the content of .track, and shows a status for each file, like "Never Marked", "Out-of-date" etc.

TODO: Arpita to give me sample output of the command.

She has also implemented fallback-based build: if django-hi/intro.ftd doesn't exist, we look for django/intro.ftd.

Next: We will build fpm mark-upto-date to keep updating translation status.

We also discussed: if django has N dependencies, do we auto-include them in django-hi if django-hi is a translation? Or should each translation mention each dependency explicitly? For a translation we may want to use the translation of a dependency instead of the original package, and while doing so we need to be able to swap the package but keep using the same name.

Next we also have to add language in fpm::Package, and ensure that the key is set if the current package is a translation-of some "original package". The original package must also have the language set, and the languages must be different. Also, translation-of can't be "chained": if this package is a translation of some other package, then no other package can translate this package; everyone must translate the same package. Each translation "cluster" has a single original package that every other package in the cluster is translating.

Finally we need to come to the UI. We have a few scenarios: say a translation doesn't exist for some document, in which case we have to show a message on top saying you are seeing the English version because this document is not yet translated into Hindi. Similarly we need UI for "never marked", "out-of-date" and "up-to-date".

We want people to be able to control these messages, and these messages themselves must be translated. We will have files FPM/{missing, never-marked, out-of-date}.ftd; if they are not present we will show some fallback message. When creating a new translation using a future fpm create-translation command we will maybe auto-create these files.

In the never-marked and out-of-date scenarios we will have three FTD files to build, which will go into the generated .html file. The "message" file will show at the top of the page, and then one of the other two, original or translation, will be shown. The "message" file will have a button with an $on-click$: call-js <function-name>, and we will have an FPM.js file with functions show-original and show-translation (and a "client variable": fpm.showing-original).

We will also need other server variables to be used when building “message” file with variables like original-language, translation-language, translation-status: "never marked" etc, original-marked-on-date, original-latest-date, original-diff etc.

Document-Name and Package Ambiguities

15th Dec 2021

When referring to a document we use a document id. The document id for a document in the current package is the relative path from the package root. If we are referring to a document from another package, we use <package-name>:<document-name>, a colon-separated path.

On its own, if you see x it can be a document in the current package. You may ask: why not x.ftd? Are we allowing x/index.ftd to be referred to by x? No. Our index.ftd is only special at the package root level, not at individual folder level. We follow the x.ftd and x/y.ftd convention and do not fall back to x/index.ftd, as say node etc do. There are advantages to the fallback, eg the whole of /x/* lives in a folder x, which brings some simplicity; without it, if you want to ignore that path, you have to ignore the x/ folder as well as the x.ftd file.

We also have complications around import paths in ftd documents vs file paths on the command line. If you want to import x.ftd you write -- import: x, without the extension. Does that mean whenever you refer to x.ftd in an FPM context it must be x? Eg on the command line, if you want to add tracking on x.ftd or sync x.ftd, do you use x or x.ftd?

We can have a rule: in any context where only an ftd file is meaningful (eg in an import statement you cannot import a .txt file, though you can track one), we use the path without the .ftd extension; wherever any file can be passed, we include the extension.

We can also standardise on file vs document terminology. Document in the FPM context strictly means an FTD file, whereas file means any file. So function names, parameters etc can use the print_file() or print_document() naming convention.

There is one more complexity. Say you have a logo.png file; there can also be a logo.png.ftd file which contains the metadata or web index of logo.png. We can argue that this is too much complexity and let’s have no such special convention. For example, it raises another complexity: what should the URL of such a logo.png.ftd file be? The URL of logo.png is /logo.png if it is at the package root, but for x.ftd the URL is /x/, so by this logic the URL of logo.png.ftd would be /logo.png/, and serving different content based on the presence or absence of a trailing / in the URL seems like a bad idea. Or maybe we do it based on the “accept” header: if you request HTML for /logo.png you get the content of logo.png.ftd, and if you request an image you get the .png. But then how would you link to the raw file? Say you want to open the image in a new tab?
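The URL convention discussed here (root index.ftd maps to /, every other .ftd document maps to /<id-without-extension>/, static files keep their path) can be sketched as follows. This is an illustration of the convention, not the actual fpm code:

```rust
// Sketch of the URL convention: the root index.ftd maps to "/", any
// other .ftd document maps to "/<id-without-extension>/", and static
// files keep their path as-is. Illustrative only.
fn document_url(id: &str) -> String {
    if id == "index.ftd" {
        return "/".to_string();
    }
    match id.strip_suffix(".ftd") {
        // x.ftd -> /x/, a/b.ftd -> /a/b/
        Some(stem) => format!("/{}/", stem),
        // static files like logo.png keep their extension
        None => format!("/{}", id),
    }
}

fn main() {
    assert_eq!(document_url("index.ftd"), "/");
    assert_eq!(document_url("x.ftd"), "/x/");
    assert_eq!(document_url("a/b.ftd"), "/a/b/");
    assert_eq!(document_url("logo.png"), "/logo.png");
    println!("ok");
}
```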

TranslationStatus Output

14th Dec 2021

The fpm cli is currently littered with println() and eprintln() calls all over the place. This works okay if we assume the output will always be a terminal, but we want to support JSON output as well. And even if we did not, this is bad software design in general: if the code deciding how to show output is spread over the entire crate, it is harder to modify, test etc.

What we should do instead is that each of our commands must have a corresponding output type defined. Currently we return just a success indicator, and failures are returned as some custom error type. That sounds fine in general but has no provision for warnings etc. Say you are building cp: someone asked you to copy 10 files and you failed to copy the last one. Should you return only the Err part? Then you have to stuff information about the files that were successfully copied inside Err. Or do you say that if even one file succeeded we return Ok?

This is just misusing Result. We should use Result::Err to indicate something really bad has gone wrong and we could not proceed at all, say the command you tried to execute does not exist, or the config file is not found. Everything else that goes wrong, eg a given FTD file has bad syntax or could not be read due to permission issues, we should report as part of Result::Ok.

Further, we should have a type for each command representing its output, which can be printed to stdout with colour highlighting, converted to JSON, etc. Also, main() should do the printing to stdout or the conversion to JSON, not the individual command handler functions.
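A sketch of what such a per-command output type could look like (the names here are made up for illustration, not actual fpm types): the command returns Ok with a value that carries both successes and warnings, Err is reserved for "could not proceed at all", and only main() decides how to render:

```rust
// Hypothetical sketch: the command returns Ok(SyncOutput) even when
// some files failed, carrying warnings alongside successes; only
// main() decides whether to render it as text or JSON.
struct SyncOutput {
    synced: Vec<String>,
    warnings: Vec<String>,
}

impl SyncOutput {
    // one possible rendering; a to_json() sibling would live here too
    fn to_text(&self) -> String {
        let mut out = String::new();
        for f in &self.synced {
            out.push_str(&format!("synced: {}\n", f));
        }
        for w in &self.warnings {
            out.push_str(&format!("warning: {}\n", w));
        }
        out
    }
}

// Err is reserved for fatal situations (eg missing config file).
fn sync(files: &[&str]) -> Result<SyncOutput, String> {
    let mut out = SyncOutput { synced: vec![], warnings: vec![] };
    for f in files {
        if f.ends_with(".bad") {
            // per-file failure becomes a warning, not an Err
            out.warnings.push(format!("could not read {}", f));
        } else {
            out.synced.push(f.to_string());
        }
    }
    Ok(out)
}

fn main() {
    let out = sync(&["a.ftd", "b.bad"]).unwrap();
    assert_eq!(out.synced, vec!["a.ftd".to_string()]);
    assert_eq!(out.warnings.len(), 1);
    print!("{}", out.to_text());
}
```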

Arpita Next Steps

13th Dec 2021

  • fpm sync <list of files file>
  • fpm status <file> to show all statuses for the file
  • fpm status to show status of adhoc tracking but not translation stuff
  • fpm stop-tracking
  • fpm diff <file> to show all diffs (including tracking diff, translation diff)
  • fpm diff -a to show everything for every file
  • support tracking, sync, diff for any file, not just ftd files
  • allow any file type to track any file type
  • fpm translation-status
  • fpm translation-status <file or folder>
  • fpm mark-upto-date a --target b --on <timestamp>
  • fpm diff <optional-folder> to only show regular diff (only changes from .history file)

fpm::document refactor

12th Dec 2021

Been looking at code and found a struct fpm::document::Document. Not too happy with its design right now.

The idea is we have a lot of commands, and many of them go through all the documents in the current package; fpm::document::Document and the corresponding get_documents() give you a vector of these documents so handling all files is easier. The main job of get_documents() is abstracting over the directory walking, so you do not have to worry about anything and can just do for d in get_documents(). Another thing get_documents() takes care of is ignoring documents that should not be handled, eg the .history, .track, .build etc special folders.

I was thinking about what’s bothering me so much about it. The first thought is abstraction and frameworks. I am not fond of them unless there is a strong reason. It’s like a trade: you want a nice Mercedes, sure, go ahead, you just have to pay for it. You can’t just wish for it. If you want to create a framework you have to understand all future needs. You have to educate people, it must be well documented. There must be abstraction boundaries. Everyone must understand what the framework is doing, etc etc. People look at a framework and say: I will also make one. But I will do all the work later. If you wonder where tech debt comes from, this is where it comes from. You went to a Mercedes showroom and drove off with a shiny car, telling them you will pay for it later. Live with your “bailgadi” instead.

So our bullock-cart should be std::path::PathBuf. And instead of a framework (a full-on abstraction over the file system, which is what the fpm::document module is trying to be, so that you never deal with the file system API directly), we should think of use cases and write helper methods.

Let’s see what all we need. We need to know if a given path is an ftd document, markdown, or some static file. We also want to know if a path is top level; in fact there is only one such use case, we do only one special handling: the top-level index.ftd maps to index.html, every other .ftd maps to <without-extension>/index.html. Further, during processing we want to report the relative path of every document being processed, with respect to the current directory.

Another thing we need is some sort of expectation about whether we are running from the package root directory or from any random path inside it. We want to support the latter case, so no function should be written assuming it runs from the package root. Either that, or we assume everything runs from the package root and do a chdir at the start of the program. If we do change directory, we still need to keep track of which directory we were called from, so that during reporting we show relative paths.

We also need to be able to process a subset of a directory, either the current directory or a directory whose path was specified on the command line. So whatever directory walker we use needs that ability.

Further, we sometimes want to process files from another package. Say in case of translation, we want to walk the original package and find corresponding files in the current package. That too we would want to do starting from an arbitrary folder, eg the folder corresponding to the current dir in the original package (if the package root is foo, you are currently in foo/a/b, and the original is foo-en, then we want to traverse all files in .packages/foo-en/a/b/).

Contract 1: We will change into the package root at app start
To simplify things our code can assume it is running from the right place. The config object will keep track of the original current directory.
Contract 2: We will always use PathBuf or Path in our program
All paths will be stored the “Rustonic” way. We will have a method on the config object that converts any path into a string relative to the original current directory, for display purposes.
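Contracts 1 and 2 together could look something like this sketch, using std::path only (the real code would use camino per Contract 3; the Config shape and method name here are assumptions):

```rust
use std::path::{Path, PathBuf};

// Sketch of Contracts 1 and 2: paths are stored relative to the
// package root, and a helper converts them back to strings relative
// to the directory the user invoked fpm from. Names are illustrative.
struct Config {
    // directory fpm was invoked from, relative to the package root,
    // recorded before we chdir into the package root
    invoked_from: PathBuf,
}

impl Config {
    // render a package-root-relative path for display, relative to
    // where the user actually ran the command
    fn display_path(&self, path: &Path) -> String {
        match path.strip_prefix(&self.invoked_from) {
            Ok(rest) => rest.display().to_string(),
            // path is outside the invocation dir; show it from root
            Err(_) => format!("<root>/{}", path.display()),
        }
    }
}

fn main() {
    let c = Config { invoked_from: PathBuf::from("a/b") };
    assert_eq!(c.display_path(Path::new("a/b/x.ftd")), "x.ftd");
    assert_eq!(c.display_path(Path::new("y.ftd")), "<root>/y.ftd");
    println!("ok");
}
```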
Contract 3: We do not support non UTF-8 paths
In fact we would be using camino for storing paths, so we do not have to do .unwrap() when converting path to str everywhere.
Contract 4: If fpm::Config exists we can be sure its a proper package

We only create fpm::Config after verifying the package is proper (it may not have all the valid ftd files etc, but its FPM.ftd exists and is valid, and its dependencies have been checked).

We generally want to make as many such guarantees as possible. If there is ever a constructor() -> Result<X>, then since constructor() is clearly a Result<>, meaning it could have failed, the rest of the code must be able to assume that if it was possible to check the validity of X, constructor() has checked it, and they can blindly trust its output. The only exception is the “expensive check scenario”, where it is not feasible for constructor() to check everything because the check would be too expensive and detrimental to the performance of the program. In such cases the documentation of constructor() must say so, stating which checks have been left out, and there must be some X.deep_check() or similar that performs the rest of the expensive checks.

Contract 5: Temporary files are created in /tmp etc
We do not litter current directory with partial and temporary folder, like zip file being downloaded when package is being fetched.
Contract 6: We never go beyond the package root
We ensure this by always creating file paths using config.assert_in_package(), which panics if we go beyond the package root. It must not be relied upon as the only defence; it is the last resort, and all best practices should be used when creating paths.
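The core of such a check can be sketched as follows. The real config.assert_in_package() panics; this illustrative version returns a bool for clarity, rejecting any path that, after resolving .. components, would escape the package root:

```rust
use std::path::{Component, Path};

// Sketch of Contract 6: reject any package-relative path that would
// escape the package root via ".." components or by being absolute.
fn is_in_package(relative: &Path) -> bool {
    let mut depth: i32 = 0;
    for c in relative.components() {
        match c {
            Component::ParentDir => {
                depth -= 1;
                if depth < 0 {
                    return false; // escaped the package root
                }
            }
            Component::Normal(_) => depth += 1,
            Component::CurDir => {}
            // absolute paths are never package-relative
            Component::RootDir | Component::Prefix(_) => return false,
        }
    }
    true
}

fn main() {
    assert!(is_in_package(Path::new("a/b.ftd")));
    assert!(is_in_package(Path::new("a/../b.ftd")));
    assert!(!is_in_package(Path::new("../secrets.txt")));
    assert!(!is_in_package(Path::new("a/../../etc/passwd")));
    println!("ok");
}
```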
fpm fmt
Markdown Index Support and 0.1.5 Release

10th Dec 2021

We already supported converting markdown files to HTML during the build process. Shobhit has further added “markdown index” support: for any directory a with a README.md in it, and no corresponding a.ftd file, we render the content of the README.md at /a/.

Vipin needed font support, and I realised we had not released anything in a while, so I created a new release.

Package Interface and Package Alias

We were discussing package interfaces, and we have to start creating some of them now, now that packages and the basic machinery are working fine.

The motivation for package interface, and aliases, is to make it easy to change themes and “component libraries”, without doing much change in your package.

When using a theme or a component library you have to first add them as a dependency in your FPM.ftd:

-- import: fpm

-- fpm.package: foo

-- fpm.dependency: some-lib

If we just do this, we can now import some-lib from any file in the foo package.

But what if we want to change that some-lib? First of all, someone has to ensure that the new library is compatible with some-lib, else switching would be a tedious process: we would have to update every use of some-lib with a compatible component or type in the new-lib. What guarantee is there that new-lib has corresponding types and components (with the same arguments, types and semantics)?

Say some-lib has exposed a component heading for authoring headings in documents:

using heading from some-lib
-- import: some-lib

-- some-lib.heading: the heading
level: 1
Now when we switch to new-lib, maybe they have decided to expose heading with the name h1, so we have to do the following:
-- import: new-lib

-- new-lib.h1: the heading
Notice the changes: we have updated the import line, then updated the module-name part of each reference from some-lib to new-lib, and finally switched from heading to h1 and removed the level.
Using import as
We could have avoided the second change by using the import as feature:
-- import: some-lib as lib

-- lib.heading: the heading
level: 1
In this case we have reduced the number of edits needed when switching component libraries, and would only have to focus on the incompatibilities between the two libraries. If the two were compatible, we would only have to change one line in each ftd file when switching.
Using Package Alias in FPM.ftd
We can do better by supporting package alias:
-- import: fpm

-- fpm.package: foo

-- fpm.dependency: some-lib as lib
Here we have aliased some-lib to lib in FPM.ftd itself, so all the ftd files in the package can just import lib now, and we have far fewer changes to make when switching:
-- import: lib

-- lib.heading: the heading
level: 1

If the packages were compatible, had same components and types, then we can switch packages by modifying only one line in the entire package!

But how do we ensure more and more packages are compatible with each other like this so it’s easy for people to switch?

Package Interfaces

What if there was a way for a package to say it has exactly the same types and components exposed as some other package? What if any package can act as an “interface”, and any other package can “implement” it?

We actually do not want to do anything special to declare that some package is an interface. Every package is an interface. Just that they are not well known, others are not going to “implement” that package. But any package can become popular and others can start targeting that package.

When you are creating a package you may want to implement an interface; for that matter, a package should be able to implement more than one interface, as long as the interfaces are mutually non-conflicting. Two packages would be “conflicting” if, say, both exposed a component heading but taking different parameters. Or maybe one uses heading as a type (say a record or variable name) and the other has a component named heading.

Now imagine there was some well-known package, fifthtry/blog for example, and then a package can declare that they implement fifthtry/blog by using implements: package key:

-- import: fpm

-- package: some-lib
implements: fifthtry/blog

And say new-lib also has declared that they implement the same. Now our fpm check (and fpm build etc) will check if the contract is indeed valid, if indeed some-lib has exactly the same types and components with compatible names and arguments as fifthtry/blog, else everything fails.

In fact providing extra components or types, or even extra optional properties to records or extra optional parameters to components is also fine, our interface checker would not mind that.

Now when someone is using some-lib they can do the following:

-- import: fpm

-- fpm.package: foo

-- fpm.dependency: some-lib
interface: fifthtry/blog

Here we have said: please use some-lib, but do not let me use just any item from some-lib, only the items that are defined in fifthtry/blog. We can then import fifthtry/blog from any of our ftd files, and our system will know we really want the some-lib stuff; fifthtry/blog has become an alias for some-lib. With this, switching to new-lib is guaranteed to work with just a one-line change.

Finally, we can use both interface and as together; say if the name of the interface package is long, you can also use an alias.

-- import: fpm

-- fpm.package: foo

-- fpm.dependency: some-lib as blog
interface: fifthtry/blog

Now you can do -- import: blog everywhere. There you go, one-line change theme switching with guarantees.

We, FifthTry, will seed things with enough packages that act as interfaces, we will study different domains, resume, photo album/journal, podcast, blog, pricing page, contact page, testimonials, simple product listing, book review site, books, api, etc etc, and propose the first set of package interfaces and seed the first set of professional quality themes for each interface. Once the ball is rolling, the world will take over.

Thoughts: Tangled Documentation
Let’s talk about software documentation for a moment. Most programming languages have the provision to write documentation as part of code. Consider this code:
def lower(x):
    """lower() returns the lower cased version of the passed value"""
    return x.lower()
Here the string “lower() returns…” is the documentation of the lower() function, and it serves two main purposes: 1. it shows up in the IDE:
IDE showing the documentation

And 2. it shows up in pydoc etc generated documentation of the software package. Eg consider this page.

Java, Rust, many languages have some features like this.

Problem 1: Editing Is Hard
If we put documentation in source code, we are limiting who can edit that part of the documentation. Not only does one have to learn git etc before contributing, the lint etc also has to pass. Further, each language has its own weirdness you have to learn.
Problem 2: Discourages Long Form Documentation

If a project chooses exclusively source-based documentation, it becomes hard to include guides, tutorials, overviews etc in the docs. People tend to throw all of that into an assortment of scattered README and other markdown files.

Projects that use sphinx, for example, do a better job of it: the documentation files drive the extraction of documentation from source files. This allows people to structure documentation properly.

The internet is full of tutorials and how-tos that should have been part of the original documentation, but are not, because of the high bar on contributing to source code.

Problem 3: Translation is impossible
If you are putting documentation alongside source code, you cannot include docs in 10-20 languages, now can you? It is simply impossible to translate source code documentation with the current state of tooling.
Problem 4: Versioning

It is virtually impossible to update the documentation of an already published version of the software. A documentation change is clubbed with software changes and has to go through a software release cycle.

I discussed other problems with documentation and versioning on 9th Dec 2021.

So What Should Be Done: Tangle And Weave

Like we have the language server protocol, LSP, supported today by many editors, we should have some sort of language documentation protocol. Maybe not a protocol, as that implies a network protocol; we should have some standard file-system layout, package format or whatever, for where to get documentation from.

So say we create fifthtry/pydocs package interface for documenting python packages, or maybe even more specific fifthtry/python-cli vs python/django-app etc. And so on for all programming languages.

We then create tooling to extract source docs from python etc packages and update the docs. So you have your python source checked out in a folder, and next to it you have the docs checked out, and there is a tool that extracts all source docs from python and updates the docs. And then there is another tool that does the reverse, takes the docs from the docs folder and updates the python files.

This is sort of literate programming in reverse. You edit docs in source files and extract docs out of it, and then modify the extracted docs and put them back in. Javadocs etc do only one step, take out the docs from the source, but they do not put things back in.

Patching pip/cargo etc

If we set this up, we will then have to patch pip, cargo etc (things that download source code) to fetch the source, then fetch the documentation package in the user’s preferred language, and update the source with documentation so the IDE can show docs in the right language.

Of course, you would want to do this only in development builds, in a production environment you do not need to do the extra work, nor do you want to mess with line numbers in source code so all error stack traces are consistent.

This is basically a hack. Ideally, editors should be updated to support the documentation package format itself and show tooltip docs from DPF files instead of the source file. But during development we do use jump-to-definition, and often scroll through source to read the docs, so patching the source is also needed.

Arpita Next Steps

9th Dec 2021

  • fpm sync <list of files file>
  • fpm status <file> to show all statuses for the file
  • fpm status to show status of adhoc tracking but not translation stuff
  • fpm translation-status
  • fpm translation-status <file or folder>
  • fpm stop-tracking
  • fpm mark-upto-date a --target b --on <timestamp>
  • fpm diff <optional-folder> to only show regular diff (only changes from .history file)
  • fpm diff <file> to show all diffs (including tracking diff, translation diff)
  • fpm diff -a to show everything for every file

Ad hoc Tracking Is Here!
Arpita has finished implementing the first version of fpm start-tracking and fpm mark-upto-date commands, and updated fpm status and fpm diff to support ad-hoc tracking.
Notes on Deleting Files And History

We won’t delete the history of a file when a file is deleted. We currently have .history/foo.<timestamp>.ftd for every modification of foo.ftd. We will soon also have .history/foo.<timestamp>.ftd.info.

In some cases only the .info file may exist, when foo.ftd has been deleted.

The .info file will also contain the “commit message” of a change. In future, when we implement history browsing, it will also contain a flag telling the UI to ignore a particular change, saying it’s a minor thing and is distracting. We don’t actually delete the timestamp file and merge, because timestamps are created on sync, and after sync they get published; others may have stored that timestamp in their .track files.

Technically, we can handle missing .timestamp files in a .track file by using the immediately previous entry, but that would cause double reviews in edge cases. I prefer we never delete history.

.latest.ftd
What about .history/.latest.ftd file? When a file gets deleted we should not delete the entry for that file from the .latest.ftd file, but instead add a deleted: true flag to it. If by the same name another file gets created, we will remove the deleted: true flag.
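So the entries in .latest.ftd might look something like this; the syntax is hypothetical, the actual file format may well differ:

```ftd
-- latest: foo.ftd
timestamp: 1639400000

-- latest: bar.ftd
timestamp: 1639400000
deleted: true
```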
Auto Stop Tracking If File Is Deleted?

If a is tracking b and b is deleted, we could remove the tracking information from a.track. This means if a new file b is created in the future, the user will have to start tracking it again.

Here we are assuming if a file is deleted and another file with the same name is created later, the two files are logically not related at all and just happen to share the name.

We can also show users a warning when they do fpm status etc, saying a file is gone and prompt them to stop-tracking explicitly.

Multiple SQLite DB Support
$processor$: package-query currently only supports querying one sqlite database at a time. SQLite supports attaching more than one database file to a connection, and we should be able to support that as well:
-- string foo:
$processor$: package-query
db foo: foo.sqlite
db bar: path/to/bar.sqlite

SELECT * FROM [foo.table1], [bar.table2]
We can use the db aliases in the query.
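For reference, SQLite itself already provides this: a single connection can attach additional database files under a schema name, and a query can then refer to tables across them. The db aliases above could map onto this mechanism (how exactly we wire it up is to be decided):

```sql
ATTACH DATABASE 'path/to/bar.sqlite' AS bar;
-- tables in the attached file are addressed via the schema name
SELECT * FROM main.table1 JOIN bar.table2;
```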
Thoughts: SQLite And Build Optimisation

One of the issues with SQLite is we may not be able to do an optimisation.

When things are rendered via fpm-repo things are fine, but with fpm build, things can become a problem as package size or complexity grows. With support for HTTP request handling and SQLite queries (and we will be adding a lot more, we are just getting started), the time fpm build takes may start becoming an issue.

One way to speed things up could be build caching: do not rebuild something that has not changed. Consider the simple case of a.ftd, which does not import any other document. Assume also that it does not use dynamic features like fpm.now, which returns the current time. In such cases we can choose not to rebuild a.ftd if its content hasn’t changed.

We can detect if a document uses another document or dynamic features, store this information, and use it to decide whether we should rebuild things. Eg, if we know some document has used fpm.now and has no other dependency, then even if that document’s content hasn’t changed we still want to regenerate the corresponding html, as fpm.now returns a different value on every run.
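The rebuild decision just described can be sketched as a small predicate: rebuild when the content hash changed, when any dependency changed, or when the document uses dynamic features. The struct and field names are illustrative, not actual fpm code:

```rust
// Illustrative sketch of build caching: a document is rebuilt if its
// content hash changed, if any of its dependencies changed, or if it
// uses dynamic features like fpm.now (which change on every run).
struct DocInfo {
    content_hash: u64,
    uses_dynamic: bool, // eg fpm.now
    deps_changed: bool, // any imported doc / sqlite file changed
}

fn needs_rebuild(prev_hash: Option<u64>, doc: &DocInfo) -> bool {
    doc.uses_dynamic
        || doc.deps_changed
        || prev_hash != Some(doc.content_hash)
}

fn main() {
    let unchanged = DocInfo { content_hash: 42, uses_dynamic: false, deps_changed: false };
    // same hash, no deps, no dynamic features: skip the rebuild
    assert!(!needs_rebuild(Some(42), &unchanged));
    // first build: no previous hash recorded
    assert!(needs_rebuild(None, &unchanged));
    let dynamic = DocInfo { content_hash: 42, uses_dynamic: true, deps_changed: false };
    assert!(needs_rebuild(Some(42), &dynamic));
    let dep = DocInfo { content_hash: 42, uses_dynamic: false, deps_changed: true };
    assert!(needs_rebuild(Some(42), &dep));
    println!("ok");
}
```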

We can do this optimisation for sqlite as well, when an arbitrary sqlite file is used: if the sqlite database file has changed then we need to rebuild any document that queries it, even if the document itself hasn’t changed.

But when we talk about querying the package database, a sqlite database that we would be creating/updating based on the content of all documents in an fpm package, that sqlite file will always change, and any document that queries the package database can never be cached.

Unless we break the package database into smaller databases. Say for each record we have a sqlite file; if some document is only interested in instances of that record, and we know only a few files have created instances of that record, then the out-of-date graph becomes a lot smaller: just those few files.

Smaller files have problems though: some queries would want to use joins, and if the tables are in different databases a join may not always be possible (not sure how things change once we support multiple sqlite databases in a single query; can that negate this drawback?). Not sure how foreign keys would work either.

One thing we can do is both: generate the big database as well as the individual ones, and do it opportunistically, only generating the databases actually referred to by some document. This way we can even choose not to generate the big package database if no document queries it.

How about a database for use for other packages? Say package a wants to query database of package b? That is slightly simpler because then you are not worrying about build time but sync or publish time. It may be okay to do more work during publishing.

But then again why are we calling build all the time? Why are we worried about performance of the build at all? Because we currently do not support auto reload?

Auto reload means we have to support on-demand generation of pages, in which case we don’t have to worry about caching; the moment we introduce support for fpm server, even without the auto reload feature, this whole discussion becomes moot.

Making fpm build fast is still important, but we can deprioritise it as it is no longer urgent: my concern so far was the author edit workflow, which would be improved much more by auto reload and fpm serve than by making fpm build fast.

Merged: Static File Support Is Here!

Till now fpm build only processed .ftd files in an fpm package. Shobhit has added support for handling static files (files other than ftd and markdown, and not ignored by .gitignore/fpm.ignore etc).

Now we can add images etc as part of an FPM package and they get included in the generated static site.

Since we kind of now include all files in the current directory, it may be a good idea to review the files that are being processed and ignore files you do not want published, both for performance and security reasons.

Thoughts: Security

One thing we have to be constantly aware of when writing/reviewing any FPM code is accidentally copying, say, /etc/passwd into the build output. This can happen if we start supporting symlinks: someone has a package you are contributing to, and a symlink gets you to accidentally copy extra stuff into .build and publish it to the world.

It can happen through more than just symlinks. Say in $processor$: package-query the value of db is set to ../firefox-password-store.sqlite; again we may accidentally read beyond what we should, which is the content of the current package or any package that is a direct dependency of it. In fact, if the current package depends on a package a, and a depends on b, then we must not allow the current package’s files to directly reference files in b without adding an explicit dependency on b. While this is technically not a security issue, it causes problems: if a removes its dependency on b, the package will suddenly stop building.

We are shipping a binary that runs on people’s machines; this is a threat vector. While it is easy to audit our code for direct threats (attempts to do mining, or scanning stuff), we are also letting our binary download packages authored by others from the internet, and we have to guarantee that those files cannot do any harm either.

FTD must never get a feature that, for instance, lets a package directly execute commands without the user understanding the command (this can be surfaced when a package is downloaded for the first time: we should show all the external commands this fpm package will be able to run, etc).

The thing is, this is very tricky. We cannot say we won’t let package authors run anything; that would significantly decrease the power of FPM packages. We have to allow people to query data from their trusted data stores, to make it part of generated documents, if they want to. But we cannot surprise people. Even HTTP requests are sensitive, not just command execution.

Markdown Special Variables

  • fpm.markdown-filename etc
  • fpm.package.name vs FPM.package-specific-variable
  • import path disambiguation: can we use : as separator?
  • import: <package>: is ugly; import foo means foo is either in the current package or a package which is a direct dependency of the current package

  1. a (local), a: (package index), a:b/c (-- import: a: is ugly)
  2. a (local or package-index), a:b/c (pretty)
  3. @a (local), a, a:b/c (explicit)
  4. a (local), @a, b/c@a

generalised import
-- import: a as b
from: <package>
exposing: x
exposing: y

-- import: x as y
Here, x is either a document in current package or a package.
fpm and markdown
-- import: vars
from: fpm

-- import: fpm
What if we have fpm.ftd in the current package, how do we import it?
-- import: fpm
from: .
How do we handle code include?
Currently when using ftd.code we include the code block in the body. What if we wanted to refer to code in a source file?
-- ftd.code:
lang: py
$processor$: include-source
file: a.py
tag: foo
a.py
import os

# FPM-START: foo
def foo():
    print("this is foo")
# FPM-END: foo

Currently the ftd $processor$ can only be used for variables and lists; we have to extend it to work on component construction as well.

We may also want to specify more than just the body, eg other headers.

fpm tracks and fpm mark is here!

Arpita implemented these two commands.

We had a funky fpm a tracks b idea, which no command line tool does. She used the fpm tracks <destination> <source> syntax. We should maybe make it: fpm start-tracking <f1> --target <f2>. Also, the code assumes we will only track ftd files, but we have to be able to track any file; remove the .ftd extension handling.

fpm mark-upto-date <f1> --target <f2>. Here target would eventually be optional: if the target is missing and f1 tracks more than one file, the cli would prompt for which file.

If the file is not found it silently returns; it should report an error. If the file is already being tracked it should show some message.

Not sure if fpm diff should show all the diffs, local modifications and tracking diff, or should we have a separate command.

Explanation Of Dynamic Documents and Actions

8th Dec 2021

Recorded a design review of what I am calling the upcoming “FPM Web Framework”.

100th Commit! We now honour .gitignore etc
Shobhit just implemented support for ignoring files that git ignores. You can also ignore more files by populating fpm.ignore: string list with patterns compatible with .gitignore file patterns.
Some thoughts on Versioning
FPM is for storing documentation, not software. Git is for storing software, and people also use it for storing documentation.
Problem With The Way Git / SVN etc Do Versioning

There is a problem with the way Git etc manage versioning which makes them a poor solution for storing documentation, in my opinion. Traditional “source code managers” create a new folder or a branch for managing versions.

Say you are working on django 1.0 and it’s time to create django 2.0: you create a new branch for version 2.0 and start working on it.

At this point, the two branches have diverged. All new changes now go to the 2.0 branch. You may even make main or master follow the 2.0 branch. For all practical purposes, 1.0 has stopped. You will want to be a good citizen and support 1.0, backporting some of the bug fixes, especially security fixes. In projects like Linux people even backport whole features to older versions.

But back-porting is extra work. Most people recommend sending a pull request against the latest main/master, and the changes get included in the next release cut from main/master.

Documentation and Software Are Different

Back-porting bug fixes and features is kind of hard. You have to verify the software indeed works correctly. You have to write unit tests and verify there are no regressions. Software releases and continuous maintenance are work. It’s usually best to focus efforts on the latest version and update as soon as one can.

Documentation on the other hand is a lot easier. There is no risk of documentation breaking anything. Wrong documentation can still have consequences, no doubt about it, but they are rather rare compared to software issues. Software is a lot more fragile than documentation.

By managing documentation like software we are doing it a disservice. It’s the wrong tool for the job.

One difference between documentation and software is that a lot of changes in documentation are not logical. Eg, if the original documentation said this function takes three arguments and the new one says it takes four, that is a logical difference. But if the original documentation was too terse, and the new documentation gives more examples, explanations and how-tos, the documentation hasn’t really changed in a logical sense.

It’s like a refactor in software. We know a refactor should not cause any change in the behaviour of the software. But in software it’s really hard to arrive at this conclusion: if someone sends a software refactor PR, it’s really hard to say that it will truly have no effect, that it’s safe to merge.

But for documentation it’s almost trivial to look at a PR and conclude that there is no real “logical” change, that it is just a better written tutorial and has not changed anything fundamental.

Updating Documentation For A Published Version Of Software Must Be Possible
Currently, it’s almost not possible to update django 1.0 documentation without also creating a new release of django 1.0. And in that case, it is no longer updating the documentation of django 1.0; it’s updating the documentation of django 1.0.1 or whatever.
Open Source Maintainers Are Busy Enough

Software is a lot of work. Core maintainers have to review every line like a hawk. To ensure nothing is broken. That there is no edge case. No violation of any contract etc.

So sending them documentation-only PRs to review, especially against older versions of the software, is almost going a bit too far, demanding too much from them.

The Root Problem In Current Documentation Workflow

If we create a new branch for every new version of the software and keep the documentation along with software in that branch, we are basically creating a copy of every file for every version.

Say you have a file tutorial.txt, there is one copy of this file in every version branch. This is the root problem. Even if there is no change in tutorial.txt, we still copy it over.

We do, what can be called, copy-on-branch, instead of what should be: copy-on-incompatible-change.

Let’s analyse django:

git clone git@github.com:django/django.git
cd django
python t.py
4.0rc1 37 72 27
wc -l tags
     312 tags

As you can see, we have 312 tags, and therefore 312 copies of overview.txt, tutorial01.txt and install.txt, but only 37, 72 and 27 unique versions of their content respectively.

If tutorial01.txt was modified 72 times, we have to ask how many of those 72 were a “logical change”, meaning the tutorial was changed because something in django changed, say some API was added or changed, and how many were an “English change”, improving the doc.

If we do the analysis, we can bring down the number of unique copies of tutorial01.txt to even fewer than 72, maybe only 3 or 4, as django is pretty stable and the tutorial for django 2.0 is still applicable for django 4. Let’s take a look at the diff between django 3.0 and 4.0:

--- 3.0.txt	2021-12-08 19:33:26.000000000 +0530
+++ 4.0.txt	2021-12-08 19:31:37.000000000 +0530
@@ -23,7 +23,7 @@
 If Django is installed, you should see the version of your installation. If it
 isn't, you'll get an error telling "No module named django".

-This tutorial is written for Django |version|, which supports Python 3.6 and
+This tutorial is written for Django |version|, which supports Python 3.8 and
 later. If the Django version doesn't match, you can refer to the tutorial for
 your version of Django by using the version switcher at the bottom right corner
 of this page, or update Django to the newest version. If you're using an older
@@ -35,10 +35,8 @@
This is technically a logical change. Though ideally this diff should not have existed at all: they should have defined |minimum-supported-python-version| like they have defined |version|. As much as possible, documentation must use variables etc., so it does not go out of date so easily or create such needless diffs for humans to review.
-    If you're having trouble going through this tutorial, please post a message
-    to |django-users| or drop by `#django on irc.freenode.net
-    <irc://irc.freenode.net/django>`_ to chat with other Django users who might
-    be able to help.
+    If you're having trouble going through this tutorial, please head over to
+    the :doc:`Getting Help</faq/help>` section of the FAQ.
Seems that they have stopped supporting IRC. Now this must not have been a django 4.0 change; they are not supporting IRC for any version of django, so this must have been a change in the django 3.0 docs as well. Imagine how many people would be using django 3.0 for years and years; they would all be going to IRC, failing to get help, and wasting time. Why? All because they decided to create a new branch for versioning, and not do what they should have done, as I will describe later.
     If your background is in plain old PHP (with no use of modern frameworks),
-    you're probably used to putting code under the Web server's document root
+    you're probably used to putting code under the web server's document root
     (in a place such as ``/var/www``). With Django, you don't do that. It's
-    not a good idea to put any of this Python code within your Web server's
+    not a good idea to put any of this Python code within your web server's
     document root, because it risks the possibility that people may be able
-    to view your code over the Web. That's not good for security.
+    to view your code over the web. That's not good for security.
Again, as you can see, there is absolutely no reason django 3.0 users should not see this change; it is a completely generic improvement applicable to all django versions.
@@ -86,6 +84,7 @@
             __init__.py
             settings.py
             urls.py
+            asgi.py
             wsgi.py

 These files are:
@@ -113,6 +112,9 @@
   "table of contents" of your Django-powered site. You can read more about
   URLs in :doc:`/topics/http/urls`.

+* :file:`mysite/asgi.py`: An entry-point for ASGI-compatible web servers to
+  serve your project. See :doc:`/howto/deployment/asgi/index` for more details.
+
 * :file:`mysite/wsgi.py`: An entry-point for WSGI-compatible web servers to
   serve your project. See :doc:`/howto/deployment/wsgi/index` for more details.
Finally, some real change that should exist. Sometime between 3 and 4 they added asgi.py, and this change rightly should not be back-ported to the django 3.0 docs.
@@ -146,16 +148,16 @@
     Ignore the warning about unapplied database migrations for now; we'll deal
     with the database shortly.

-You've started the Django development server, a lightweight Web server written
+You've started the Django development server, a lightweight web server written
 purely in Python. We've included this with Django so you can develop things
 rapidly, without having to deal with configuring a production server -- such as
 Apache -- until you're ready for production.

 Now's a good time to note: **don't** use this server in anything resembling a
 production environment. It's intended only for use while developing. (We're in
-the business of making Web frameworks, not Web servers.)
+the business of making web frameworks, not web servers.)

-Now that the server's running, visit http://127.0.0.1:8000/ with your Web
+Now that the server's running, visit http://127.0.0.1:8000/ with your web
 browser. You'll see a "Congratulations!" page, with a rocket taking off.
 It worked!

@@ -204,16 +206,16 @@

 .. admonition:: Projects vs. apps

-    What's the difference between a project and an app? An app is a Web
-    application that does something -- e.g., a Weblog system, a database of
+    What's the difference between a project and an app? An app is a web
+    application that does something -- e.g., a blog system, a database of
     public records or a small poll app. A project is a collection of
     configuration and apps for a particular website. A project can contain
     multiple apps. An app can be in multiple projects.

 Your apps can live anywhere on your :ref:`Python path <tut-searchpath>`. In
-this tutorial, we'll create our poll app right next to your :file:`manage.py`
-file so that it can be imported as its own top-level module, rather than a
-submodule of ``mysite``.
+this tutorial, we'll create our poll app in the same directory as your
+:file:`manage.py` file so that it can be imported as its own top-level module,
+rather than a submodule of ``mysite``.

 To create your app, make sure you're in the same directory as :file:`manage.py`
 and type this command:

Again, all these changes should have gone to 3.0, or even 1.0 for that matter. As you can see, of all the changes, only the asgi.py one was actually a difference between django 3 and django 4, and that should have been the only diff we saw. Every other change should have been back-ported to the earliest applicable version.

But they cannot, because of the way they manage versioning.

So How Should Version Be Maintained?

Instead of creating a copy of each file by branching (ah yes, sure, git internally doesn’t copy, but that’s git’s internal detail; from the human operator’s perspective it is a copy), we should have a single folder with all versions, one folder per version. But when creating a new version, we must not copy the content of the current version; we must create an empty folder and rely on fallback logic at documentation build time to fill in the files from previous versions.

For this we create a sparse tree: each version folder only contains a file if that file has logically diverged from the corresponding file in the previous version. One could even store just the diff, and maybe that is what we will end up doing, but from a data-structure perspective, thinking as if we have only stored the diff is the right way to think about what I am suggesting.

With this requirement to only copy when there is a logical change, the number of copies would go down, and humans could easily review whether a change is indeed logical. For that matter, humans may also realise that the change in the software itself can be tweaked so the logical change is not needed. I mean, if you are changing something, you are forcing every user of your library to learn something new; if with a small amount of effort you can ensure that no change is needed, saving hundreds and thousands of users from learning something new, isn’t that small effort worth it? And it’s not just users you are helping: with one less diverged file to maintain, you reduce the overall work you have to do over the lifetime of the project while maintaining the highest quality documentation.

SQLite Support!

7th Dec 2021

Yesterday I started implementing SQLite support for FTD documents via the $processor$ mechanism of ftd. It got stuck on some missing helper utilities in FTD; you can see the two functions at the end with todo!() as their body.

Today Arpita is feeling a bit better from the cold, and she has created a PR implementing those functions in ftd. She can’t not work even when down, it seems!

Once that change landed, I got $processor$: package-query working, and updated the documentation as well.

During implementation, I found out that the serde_json support in rusqlite is actually, umm, not sure what word to use, not buggy per se, but not as useful as I first thought it would be. It converts only text fields to serde_json::Value, when other column types like integer, float etc. can also be easily converted to serde_json::Value. I implemented a simple converter for the other sqlite data types as well.

We are not done yet. We need the ability to pass parameters to queries. Also, those parameters should be able to refer to other ftd variables. And finally, we should be able to read into more types; currently we only read into records and lists of records.
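The conversion and the parameterised queries described above are easiest to show with Python's sqlite3 module rather than rusqlite (same idea, different language; the function name is mine, not fpm's): every column type, not just text, becomes a JSON value.

```python
import json
import sqlite3

def rows_as_json(db: sqlite3.Connection, query: str, params=()) -> str:
    """Run a (parameterised) query and serialise every column type --
    integer, real, text, NULL -- as a JSON value, not just text."""
    db.row_factory = sqlite3.Row
    rows = [dict(r) for r in db.execute(query, params)]
    return json.dumps(rows)

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE person (name TEXT, age INTEGER, score REAL)")
db.execute("INSERT INTO person VALUES ('amit', 42, 9.5)")
# `?` placeholders are how parameters would be bound; in fpm those
# parameters would eventually come from ftd variables.
result = rows_as_json(db, "SELECT * FROM person WHERE age > ?", (40,))
```

Here age stays an integer and score a float in the JSON output, which is the behaviour the rusqlite serde_json feature lacked out of the box.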

Some thoughts on Translation Support

So we have reasonable sync/status/diff working now. We are going to soon implement “ad-hoc tracking”: basically a .track folder that contains a foo.track file for each foo.ftd file. status/diff will look into the .track file for any file and see if it is up to date with respect to the files being tracked.

Once we have ad-hoc tracking, we are going to work on translation tracking, where a package would be able to designate itself as a translation of another package using the translation-of key in FPM.ftd. When a package tracks another package like this, it means two things.

Every File Tracks Corresponding File

All files of the translation package are assumed to be tracking the corresponding files in the original package being translated.

In ad-hoc tracking, you have to explicitly add tracking information using planned fpm a tracks b command. But with translation, tracking relationships are implicitly assumed.

Missing Files and Fallback

It also means that if a file is absent in the translation package, the corresponding file from the original package is built.

So to create a translation of another package, you can create an empty package, and all files of the original would become the content of the translation package. Then you can start translating one file at a time, copying over just that file, so you do not have to litter your package with files that are just plain copies.

Out Of Date File’s HTML

So we have translation tracking, which means we know if a file is up to date with the corresponding file in the original package; if so, translation is complete for that file. But what if it’s not up to date? Or was never finished? What if you just created the translation file and have only translated the heading? What should we show in those cases?

So far, given a file a.ftd, we convert it to a.html and that’s it. But now for a file translated:a.ftd we have original:a.ftd, so we have to consider two files when generating a.html. In the simple case, where translated:a.ftd is “up to date”, we render translated:a.ftd and that is that. But if translated:a.ftd is missing or not up to date, we have to include the content of both, plus two potential diffs: the diff of original:a.ftd from the date translated:a.ftd was last marked up to date by a human being to the latest original:a.ftd, and the diff of translated:a.ftd from when it was last marked up to date, if ever, to the latest.

So we have two ftd files, two diffs, and possibly some metadata to include in the generated a.html. We also need some JS to let people switch between the potentially out of date or incomplete translation and the original version. And we need an area to show out of date warnings, and some UI to show/hide the two diffs.
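Each of those two diffs is just a textual diff between two snapshots of the same document; a sketch with Python's difflib (the snapshot labels and contents are illustrative):

```python
import difflib

def track_diff(old: str, new: str, name: str) -> str:
    """Unified diff between two snapshots of a tracked document."""
    return "".join(difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=f"{name} (last marked up-to-date)",
        tofile=f"{name} (latest)",
    ))

# The diff of original:a.ftd since the translation was last marked
# up to date -- this is what the out-of-date warning area would show.
original_then = "-- ftd.text: hello\n"
original_now = "-- ftd.text: hello world\n"
d = track_diff(original_then, original_now, "original:a.ftd")
```

The same function applied to the two snapshots of translated:a.ftd gives the second diff; both would then be embedded in the generated a.html behind the show/hide UI.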

User’s Area vs Special Area?

Now that we have more than the content of an ftd file to render in any generated .html file, we have to ask ourselves if we should split our UI into two logical areas: an area completely controlled by the end user using ftd, and another for the special stuff, like translation out-of-date messaging.

Tomorrow we will have more stuff. When we add versioning support, we would want to show “you are looking at an old version, jump to the latest version”, like docs.rs etc. do. And if you are looking at a specific version, you want to see two diffs as well: the diff of what changed in this precise release, i.e. from the last version to this version, and the diff of this version against the latest version. It would be good if these two bits of info were available at a fingertip, without complex git diff incantations.

For both versioning and translation, we need to show some sort of “switcher”, go to another language or version of this document.

We can do these by special variables. We can expect each “theme” to know about these special variables, and the theme author can include the UI in the most logical place.

We can also create a dedicated “document info” area, say a banner at the top, so the theme does not need to bother with all that.

We can even let people customise say FPM/document-info.ftd file whose content would be included in the special area.

And finally, we can let theme writers specify that they have indeed included the document info in the theme itself and that fpm should not show FPM/document-info.ftd; maybe we inherit FPM/document-info from the theme and let the theme set it to an empty file?

There is a little complexity when an FTD document basically says unload me and load some other document. We do not have an FTD way to do this yet. Not sure if it’s a good idea.

The thing is, this is a lot of complexity and a lot of potential changes, at least in the early days. Leaving it for each theme to manage individually would be a bad idea: the confidence with which you can switch themes would go down, as you would have to verify more things, not just the look and feel of a theme.

Started Design

6th Dec 2021

Jay has created a logo that we kind of like, and some initial design for our homepage:

This is still under works.
SQLite Support

Since we are anyway going to implement some sqlite support for the package database, we may as well let people include arbitrary sqlite files in a package, and let people run arbitrary SQL queries on top of any of the sqlite databases in the package, or in any package that is a dependency of this package.

Change Request.

Partially implemented $processor$: package-query. It needs two methods in the ftd crate; waiting for Arpita to get better and implement them.

Dynamic Documents and FPM Action

Recently I implemented $processor$: http, which paves the way for almost fully fledged “dynamic”/data-driven documents. To really expose this feature, we can implement arbitrary URL handling. Currently only the URLs corresponding to actual FTD files are available, which is too limiting: we cannot have an author page, for example, where we need a URL for each author, with the author list maybe in a table, or spread across ftd documents.

With a FPM/URLs.ftd, we can define URL patterns that can contain dynamic data, eg /<username>/ etc. We can map each URL pattern to an FTD file, and the file can extract the data by defining a schema (eg a record), whose fields would be populated from URL query parameters, URL fragments (eg username here), or even the request body if it is available.

We will sometimes need redirects etc., eg if you need to log in to access something, or if the page has moved. So we need a mechanism for early return; currently it is not possible for an FTD document to say we are done processing the page. At the least we can have some standard variable, eg fpm.redirect-url, which when set makes FPM ignore the document and return the content of fpm.redirect-url instead, along with some instruction to fpm.js to update the URL displayed in the browser.

This is almost the design of realm. Once I wrote that, I realised we also need an equivalent of “realm actions”, and it turned out the design is really easy. Wrote about it in another CR: fpm actions.

The key element of actions is that we now need an fpm.js, which will have some behaviour, and this paves the way for fpm client variables.

Client Variables

Some variables can only be computed by the client. So we need to ship an fpm.js, which will compute them and pass them to FTD. Some of these variables would also change during the lifetime of the page.

fpm.js will also do some browser history control, for managing server side “internal redirect”, and fpm actions.

fpm sync, fpm status and fpm diff

5th Dec 2021

Arpita has been on fire and implemented these three commands. This puts some “version control” features in fpm. Our goal is not to become a simplified git; we definitely take a lot of inspiration from git, but we are more like an advanced Dropbox/iCloud.

fpm repo is needed before we can really call ourselves version control or compare ourselves with Dropbox/iCloud etc. But if you are using Dropbox/iCloud, you can already benefit from these commands.

Our goal now is to implement ad-hoc tracking, and then get to translation tracking.

$processor$: toc and $processor$: http

5th Dec 2021

We implemented these two processors. The TOC processor helps you when you are creating navigation bars or the table of contents of your FPM site (this is not the intra-page TOC, which we still do not know how to implement).

$processor$: http lets you fetch data from any HTTP endpoint that returns JSON when rendering an FTD document. It opens up massive opportunities: you can fetch data from all sorts of places, and create data-driven graphs, tables etc.

Font Support

4th Dec 2021

We just added font support. This allows people to use custom fonts with FTD/FPM.

You can specify the fonts you want loaded in FPM.ftd file.

Package Dependency Fetching

3rd Dec 2021

Shobhit added support for adding package dependencies, and fetching dependencies from GitHub or any tar file. We store the dependencies in .packages.

Lots is still pending: recursive package dependencies, a way to update a package, looking for packages in FPM_HOME. But it’s exciting to have this feature land so soon.

Package Database

1st Dec 2021

Wrote about the design of the package database: basically a sqlite database that we will maintain in the initial phase of fpm build, and a way to query that database to get data stored in the different ftd documents that are part of an FPM package.

This can make fpm packages the ideal data sharing format. You can have data authored in FTD, packaged as an FPM package along with some UI code, and published to the fpm repo. Then other packages can depend on the original data package; fpm will download the data, and they can do something more with it: create more visualisations, augment the data even further using $processor$s etc., and maybe publish that package again for others to consume.

Update for 30th Nov 2021
Shobhit is working full time on FPM now. We managed to nail down quite a bit of our design now.
fpm build working
Finally we can start using fpm. We have created a template repo, FifthTry/fpm-blog (rendered version), which shows how to create a blog using FTD and FPM.
Design Review With Arpita
Arpita and I recorded a detailed design review of FPM.
Kickoff

25th Nov 2021

We have started working on this project now. We have written design docs for quite a few elements:

Shobhit is taking the lead on implementing this. We are hoping to create a minimal version that can convert ftd files to HTML, so fpm can be considered an alternative to static site generators.