The Linux VFS Model
Naming structure
Abstract structure of the Linux Virtual Filesystem (VFS) is presented.
Only the toplevel part – the naming structure – is described,
providing detailed pathname resolution semantics.
We formalize the hardlink structure, the mount structure and
describe derived structures used for pathname resolution.
We proceed by stepwise refinement –
we establish a system of (classes of) structures.
This system contains a
main development line
in which each structure is an extension of the previous one.
This way we partially provide an abstract specification of the
Linux Virtual Filesystem.
In fact, this document can be considered to be an abstract specification
of the Linux VFS in itself,
with only few specification steps (the most abstract ones) performed.
Author
 Ondřej Pavlata 
 Jablonec nad Nisou 
 Czech Republic 



Document date
Initial release  June 2, 2011 
Last major release 
June 2, 2011 
Last update  September 2, 2018 
(see Document history)

Warning

This document has been created without any prepublication review
except those made by the author himself.

The author has no experience with the Linux operating system
except for simple tests.
Linux version
This document refers to Linux 2.6.38.
The following configuration has been used for testing:
Distribution  Ubuntu 11.04 natty 
Desktop evironment  GNOME 2.32.1 (Ubuntu 20110414) 
Kernel  Linux 2.6.388generic 
Platform  i686 

GNU bash  version 4.2.8(1)release (i686pclinuxgnu) 
Table of contents
Introduction
VFS versus RTFS
As of time of writing this document,
the acronym RTFS has two meanings (known to the author)
in the context of file systems:
 (A)
The Real Time File System ([]).
 (B)
As used in [],
within the term RTFS country.
Gray color indicates that we are concerned with the second meaning here.
We start with the following premise:
Userobservable semantics of the Virtual Filesystem should be described
outside the RTFS country.
The main objective of this document is to contribute to the following shift in
proportion relating the VFS semantics
(the nonRTFS country is referred to as Abstract specification):
Abstract specification 
RTFS country


→

Abstract specification

RTFS country 

Preliminaries
Some familiarity with elementary algebra and order theory is assumed.
This involves such notions as
structure,
substructure,
sort,
generating set/structure,
(partial) function,
relation,
domain,
range,
restriction,
extension,
injectivity,
reflexivity,
transitivity,
monotonicity,
isomorphism,
(order) chain,
closure operator.
General conventions
Our notation mostly follows settheoretic conventions as shown in the following table.
Notation 
Description 
X × Y 
The Cartesian product of X and Y. 
UU 
The universal class – the settheoretic class of all sets. 
Id 
The identity class –
the settheoretic class of all pairs (x,x). 
Dom(R) 
The (definition) domain of a relation R
– the least X, w.r.t. ⊆, such that
R ⊆ X × UU. 
Rng(R) 
The range of a relation R
– the least Y such that
R ⊆ UU × Y. 
R ↾ X 
The (domain) restriction of the relation R to
X,
R ↾ X = R ∩ (X × UU).

R ○‣ S 
The composition of relations.
Inner triangle indicates the lefttoright direction
(first R, then S).
For functions with leftdot notation,
we write .r.s instead of .r ○‣ .s.

R^{−1} 
The inverse (opposite) of the relation R. 
X ↷ Y 
The set (class) of partial functions from X to Y.
We write .f : X ↷ Y instead of .f ∈ X ↷ Y.

X^{↷} 
The set (class) of all partial functions on X, shorthand for X ↷ X. 
X.r 
Usually, the image of X under .r,
X.r = { x.r  x ∈ X }. 
R^{∗} 
The reflexive transitive closure of the relation R
(with the reflexive closure depending on the context). 
R ⋈ S 
The binary equijoin of relations R and S.
R ⋈ S = { (x,y,z)  (x,y) ∈ R, (x,z) ∈ S }.
For functions .r, .s,
.r ⋈ .s is the map x ↦ (x.r, x.s).

X ⊎ Y 
X ⊎ Y = Z
iff X ∪ Y = Z and X ∩ Y = ∅. 
Mixture of partial functions
We introduce mixture
operators as follows
(f, g are relations).
Notation 
Definition 
Terminology 
Correspondents 
MySQL 
Ruby 
f ◑ g 
(f ↾ (Dom(f) ∖ Dom(g))) ∪ g 
patchmixture of g into f 
REPLACE 
Hash#update 
f ◐ g 
g ◑ f 

INSERT IGNORE 

f ◖◗ g 
f ◖◗ g = h iff
f ∪ g = h and Dom(f) ∩ Dom(g) = ∅ 
defmixture of f and g 
INSERT 

Obviously, if f and g are partial functions,
then so is f ◑ g
(resp. f ◖◗ g if defined).
Ruby conventions
We sometimes use conventions from the Ruby programming language
[],
in particular, in the following cases:

List handling functions (see Lists).

Ruby blocks are used for function definition:
{ x expression(x) }
means the map x ↦ expression(x).
Lists
By a list we mean a function whose domain, the index set, is
a downset of (ℕ, ≤), i.e.
either the set ℕ of all natural numbers, or
a set {0, …, n1} for some natural n.
A list I → X is said to be a list over X.
The following table shows notation used for lists over a set X.
Notation 
Description 
X^{∗} 
finite lists over X 
X^{+} 
nonempty finite lists over X 
X^{∞} 
infinite lists over X 

[a,b,c] 
the list {(0,a),(1,b),(2,c)} 
let x, y ∈ X^{∗}, A, B ⊆ X^{∗}

x[i] 
the function application x(i) 
x.length 
number of members of x 
x.last 
x.last[x.length1] 
x + y 
concatenation of lists x and y 
A + B 
{ a + b  a ∈ A, b ∈ B } 
x + B 
[x] + B 
x.shift 
x without the first member 
x.pop 
x without the last member 
x.uniq 
x without duplicate values 
x.select{block} 
the selection from x according to block 
Monounary algebra
A monounary algebra is a structure
(X, .p¯) such that

X is a set,

.p¯ is a function X → X.
We introduce the following notation and terminology:
Function 
Terminology 
Definition / description 
.p¯ : X → X 
x.p¯ is the
direct successor 
parentorroot 
of x 
x.p¯ = x.p if x.p is defined,
x.p¯ = x otherwise.

.p : X ↷ X 
x.p is the
strict direct successor 
parent 
of x 
.p is the restriction of .p¯ to nonfixed points:
x.p = x.p¯ if x.p¯ ≠ x,
x.p is undefined, otherwise.

.p¯() : X × ℕ → X 
x.p¯(i) is the ith successor of x 
x.p¯(i) is the ith application of .p¯ to x:
x.p¯(0) = x,
x.p¯(i) = x.p¯(i1).p¯ for i > 0.

.p() : X × ℕ ↷ X 
x.p(i) is the
strict ith successor 
ith ancestor 
of x

.p() is a restriction of .p¯():
x.p(i) = x.p¯(i) if i = 0 or
x.p¯(i1) ≠ x.p¯(i),
x.p(i) is undefined, otherwise.

.p¯s¯ : X → X^{∞} 
x.p¯s¯ is the successor list of x 
x.p¯s¯[i] = x.p¯(i)

.ps : X → X^{∗} ∪ X^{∞} 
x.ps is the
strict successor list 
ancestor list 
of x

x.ps[i] = x.p(i)

.r : X ↷ X 
x.r is the
of x 
x.r = x.ps.last if x.ps is finite,
x.r is undefined, otherwise.

Algebraic forest
By an algebraic forest (or just forest)
we mean a structure which has one of the following equivalent forms:
 (1)
A partial order whose every principal upset is a finite chain:
A partial order (X, ≤) such that
for every x ∈ X,
the set x.ps = { y ∈ X  x ≤ y }
is finite and totally ordered by ≤.
 (2)
A monounary algebra whose every nonempty subalgebra has a fixed point:
An algebra (X, .p¯)
with just one unary operator, .p¯ : X → X,
such that for every x ∈ X,
x.p¯(i) = x.p¯(i+1) for some natural i.
 (3)
A partial monounary algebra without total nonempty subalgebras:
A partial algebra (X, .p)
with just one partial unary operator,
the parent partial function .p: X ↷ X,
such that for every x ∈ X,
x.p(i) is defined for only finitely many i.
Because the set x.ps from (1) is a finite chain in ≤,
we can regard it as a finite list.
Then the correspondence (1)↔(2)↔(3) is established as follows:
(2)←(1)


x.p¯(i) = x.ps[i] if i ≤ x.ps.length,

x.p¯(i) = x.ps.last otherwise,



(1)←(2)


x ≤ y iff x.p¯(i) = y for some i,
i.e.

i.e. .p¯ is the smallest function on X
(w.r.t. ⊆)
satisfying (∗),

(X,≤) is the reflexive transitive closure of
(X, .p¯), (∗)



(3)←(1)


x.p(i) = x.ps[i] if i ≤ x.ps.length  1,

x.p(i) is undefined otherwise.

(1)←(3)


x ≤ y iff x.p(i) = y for some i,
i.e.

i.e. .p is the smallest relation on X
satisfying (∗), the
reflexive transitive reduction of (X,≤)
(a.k.a Hasse diagram),

(X,≤) is the reflexive transitive closure of
(X, .p) (∗).

The root map .r: X → X is defined by
x.r = x.ps.last.
Obviously,
.r is a closure operator w.r.t. (X, ≤).
An element x is a root if it satisfies any of the following
equivalent conditions:

x is .rclosed,

x is a fixed point w.r.t. .p¯,

x has undefined parent x.p.

x.p¯(i) (resp. x.p(i)) means the
ith application of .p¯
(resp. .p) to x.

The form (3) shows that an algebraic forest
can be viewed as a special case of a directed acyclic graph
(DAG) which in turn is a special case of a
digraph (a directed graph).
In this context the term
algebraic forest
is equivalent to
that of a rooted forest
[].
Algebraic tree
By an algebraic tree we mean an algebraic forest with exactly one root.
As an algebra, it is a structure (X, .p¯, r), such that

(X, .p¯) is an algebraic forest,

r is the only fixed point of .p¯.
An algebraic tree is said to be locally finite
iff
for each node x, the set {x} has finite preimage under .p¯
(i.e. each node has only finite number of children).
Pointed forest
By a pointed algebraic forest we mean an algebraic forest
equipped with an artificial
root.
As an algebra, it is a structure (X, .p¯, r), such that

(X, .p¯) is an algebraic forest,

r is an arbitrary element of X.
Componentwise finite linear order
For a partial order (X, ≤), the following are equivalent:
 (a)
Each component of (X, ≤) is a finite chain.
 (b)
Both (X, ≤) and its opposite (X, ≥) are algebraic forests.
For such a case we introduce bidirectional
terminology and notation according to the following table.
First 
Previous 
Next 
Last 
.end(−) 
root in (X, ≥) 
.nei(−) 
parent in (X, ≥) 
.nei(+) 
parent in (X, ≤) 
.end(+) 
root in (X, ≤) 
Ordered tree
By a locally finite ordered tree
we mean a structure which has one of the following equivalent forms:
 (1)
A locally finite algebraic tree together with sibling ordering:
A structure (X, ≤_{V}, ≤_{S}) such that

(X,≤_{V}) is a locally finite algebraic tree.

(X,≤_{S}) is a componentwise linear order.

.r_{S}.p_{V} = .p_{V},
i.e. components of ≤_{S} are siblingsets of ≤_{V}.
 (2)
A locally finite algebraic tree together with a complementary
horizontal
ordering:
A structure (X, ≤_{V}, ≤_{H}) such that

(X,≤_{V}) is a locally finite algebraic tree.

(X,≤_{H}) is a partial order.

≤_{V} and ≤_{H} are complementary, i.e.
(<_{V}) ⊎ (>_{V}) ⊎ (<_{H}) ⊎ (>_{H})
= X × X ∖ Id.

≤_{V} and ≤_{H} are consistent:
(<_{H}) = (≤_{V}) ○‣ (<_{H}) ○‣ (≥_{V}).
(That is, for every nodes x, y such that x <_{H} y,
all descendants of x precede all the descendants of y.)
 (3)
A locally finite algebraic tree together with a consistent linear extension:
A structure (X, ≤_{V}, ≤_{L}) such that

(X,≤_{V}) is a locally finite algebraic tree.

(X,≤_{L}) is a linear order.

(≤_{V}) ⊆ (≤_{L}),
i.e. ≤_{L} is an extension of ≤_{V}.

The same consistency condition as in (2) holds,
with (≤_{H}) defined as (≤_{L}) ∖ (<_{V}).
 (4)
A parenttochildrenlist map:
A structure (X, .qs) such that

X is a nonempty set.

.qs is function X → X^{∗}.uniq,
i.e. .qs maps X to nonrepetitive lists over X.

The set { (x,y)  x ∈ y.qs } is the parent partial function
of a locally finite algebraic tree on X.
The correspondence (1)–(4) is obtained as follows.

(<_{H}) = (≤_{V}) ○‣ (<_{S}) ○‣ (≥_{V})
= (≤_{L}) ∖ (<_{V})

(<_{L}) = (<_{H}) ⊎ (<_{V})

x <_{S} y iff
x.p_{V} = y.p_{V} and x <_{L} y.

x <_{S} y iff
x appears before y in some list from X.qs.

x.p_{V} = y iff
x ∈ y.qs.

x.qs = 
[ ] 
if x ∉ X.p_{V}, 
x.qs = 
y.ps_{S} 
otherwise,
where y is the unique element of X such that
y.p_{V} = x and y ∉ X.p_{S}. 
Ordered forest
By a finite ordered forest we mean a finite disjoint union of
locally finite ordered trees.
Tree navigation
Operators for ordered tree (or forest) navigation can be sumarized as follows.
Order 
Operators 
First 
Previous 
Next / Parent 
Last / Root 
≤_{V} 
Vertical order 

relevant: .qs 
.p_{V} 
parent 
.r_{V} 
root 
≤_{S} 
Sibling order 
.end_{S}(−) 
first sibling 
.nei_{S}(−) 
previous sibling 
.nei_{S}(+) 
next sibling 
.end_{S}(+) 
last sibling 
≤_{L} 
Linear order 
.end_{L}(−) 
leftmost 
.nei_{L}(−) 
nested predecessor 
.nei_{L}(+) 
nested successor 
.end_{L}(+) 
rightmost (equal to .r_{V}) 
≤_{H} 
Horizontal order 

Labelled transition system
A labelled transition system is a structure
(S, Λ, ) where

S is a set of states,

Λ is a set of labels,

is a subset of S × Λ × S,
elements of are called transitions.
Notation x y is used for (x, α, y) ∈ ().
A labelled transition system is called deterministic if
is a partial function on S × Λ.
Multidigraph
A multidigraph is a structure
(N, A, .s, .t) where

N is a set of nodes,

A is a set of arrows, disjoint from N,

.s and .t are functions A → N
assigning to each arrow its source and target node, respectively.
Naming multidigraph
By a naming multidigraph we mean a structure
(N, A, Σ, .σ, .s, .t) where

(N, A, .s, .t) is a multidigraph,

Σ is a naming domain – a set of names,

.σ is a function A → Σ, called (arrow) naming,
assigning to each arrow its name.
The structure is subject to the following naming consistency condition:


(NC) 
.s ⋈ .σ is injective,
i.e. different arrows with equal source have different names.

Lookup system
By a lookup system we mean a structure
(X, Σ, .ℓ()) where

X is a set (the lookup domain),

Σ is a set of names,

.ℓ() is a partial function X × Σ ↷ X,
called the Σlookup.
Proposition:
The following structures are equivalent:
 A lookup system.
 A deterministic labelled transition system.
 A naming multidigraph, up to identification of arrows.
□
Using the multidigraph sourcetarget terminology
and assuming finiteness of .ℓ(),
a lookup system can be interpreted as a relational database table of the following form:
Domains →X 
Σ 
X 
source 
name 
target 
  
  
  
Underline in
source
and
name
indicates a primary key
.
Pathnames
Elements of Σ^{∗}
(the set of all finite, possibly empty, sequences of names)
are called pathnames or paths.
Given a lookup system (X, Σ, .ℓ()),
the induced Σpathnamelookup
.ℓ^{∗}() : X × Σ^{∗} ↷ X
is naturally defined by
 .ℓ^{∗}([ ]) = Id ↾ X,
 .ℓ^{∗}([α_{1}, …, α_{n}]) =
.ℓ(α_{1}) ⋯ .ℓ(α_{n}).
Proposition:
For a partial function .f() : X × Σ^{∗} ↷ X the following
are equivalent:
 (a)
The map p ↦ .f(p)
is a monoid homomorphism between
(Σ^{∗}, +, [ ]) and
(X^{↷}, ○‣, Id ↾ X), i.e.
 .f(p).f(q) = .f(p + q),
for every p, q ∈ Σ^{∗}, and
 .f([ ]) = Id ↾ X.
 (b)
.f() is a Σpathname lookup,
i.e.
.f() = .ℓ^{∗}() for some .ℓ() : X × Σ ↷ X.
(Necessarily, .ℓ() is given by
.ℓ(α) = .f([α]).)
□
Again, we can interpret the induced pathname lookup as a relational database table,
this time better considered as a virtual, possibly infinite, database view.
Domains →X 
Σ^{∗} 
X 
source 
pathname 
target 
  
  
  
Pathname resolution system
By a pathname resolution system we mean a structure
(X, Σ, .ϖ()) where

X, Σ are disjoint sets
(with the same semantics as in a lookup system),

.ϖ() is a partial function X × Σ^{∗} ↷ X
such that for every x ∈ X and every pathnames p, q,
 x.ϖ([ ]) = x,
 .ϖ(p + q) ⊆ .ϖ(p).ϖ(q).
(If x.ϖ(p + q) is defined then
x.ϖ(p) and x.ϖ(p).ϖ(q) are defined and
x.ϖ(p).ϖ(q) = x.ϖ(p + q).)
Proposition:

The domain of .ϖ() is pathnameprefix closed, i.e.
for every (x,p) ∈ X × Σ^{∗},
 if x.ϖ(p) is defined then x.ϖ(p.pop) is defined.

Let .ℓ() : X × Σ ↷ X
be a lookup defined by x.ℓ(α) = x.ϖ([α]).
Then .ϖ() is a restriction of .ℓ^{∗}().
Corollary:

A pathname resolution system can be viewed as a
lookup system (X, Σ, .ℓ())
equipped with a restriction .ϖ()
of the induced pathname lookup .ℓ^{∗}() such that
the domain of .ϖ()
 is pathnameprefix closed, and
 coincides with the domain of .ℓ^{∗}() on pathnames of length ≤ 1.

A lookup system can be viewed as a pathname resolution system such that
.ℓ^{∗}() = .ϖ().
Reachability
Given a pathname resolution system (a lookup system)
we introduce reachability relations
as subsets of X × X
according to the following table
(we let P ⊆ Σ^{∗}, Τ ⊆ Σ):
Notation 
Definition 
Terminology 
.ℓ^{∗}(P) 
{ (x,y)  x.ℓ^{∗}(p) = y for some pathname p ∈ P } 
reachability (by .ℓ^{∗}()) over pathnames from P 
.ℓ^{∗} 
.ℓ^{∗}(Σ^{∗}) 
reachability (by .ℓ^{∗}()) 

.ϖ(P) 
as above,
with ℓ^{∗} replaced by ϖ. 
.ϖ 

.ℓ(Τ) 
{ (x,y)  x.ℓ(α) = y for some name α ∈ Τ } 
single step reachability (succession) over names from Τ 
.ℓ 
.ℓ(Σ) 
single step reachability / succession 
If x.ℓ^{∗} ∋ y then y is said to be reachable from
x.
(Double)pointed lookup system
By a doublepointed lookup system we mean a structure
(X, Σ, .ℓ(), ‹/›, r, d) where

(X, Σ, .ℓ()) is a lookup system,

‹/› is a distinguished element of Σ,

r and d are distinguished elements of X
(called the current root and the current directory,
respectively).
The structure is subject to the following condition:

{r, d} × {r}
⊆
.ℓ(‹/›)
⊆ X × {r}
(in particular, .ℓ(‹/›) maps constantly to r).
By a pointed lookup system
we mean a doublepointed lookup system with r = d.
The (global) resolution target partial function
.rt : Σ^{∗} ↷ X
maps pathnames to X as follows:

p.rt = r.ℓ^{∗}(p) if p[0] = r,

p.rt = d.ℓ^{∗}(p) otherwise.
Naming Forest
Forest Naming Domain
By a forest naming domain we mean a set Σ of names
such that
Σ 
= 
Σ_{down} 
⊎ 
Σ_{spec} 
∪ 

‖ 

∪ 
Σ_{Ό} 
= 
Σ_{down} 
⊎ 
Σ_{‰} 
where
 Σ_{down} is the set of down names,
 Σ_{spec} is the set op special names
with forestrelated semantics.
They are considered logical constants
(cf. []),
We distinguish these names by enclosing guillemets, like in
‹r› or ‹·›.
 Σ_{‰} is the set op permil names
containing just 3 special names ‹·›,
‹/› and ‹··›,
 Σ_{Ό} is the set of ordinary names.
Naming forest
By a naming forest we mean a structure
(A, ≤, Σ, .σ) where
 (A, ≤) is a forest,
 Σ is a forest naming domain,
 .σ
is a function A ↷ Σ_{down} called downnaming.
For x ∈ A, the downname x.σ, if defined,
is said to be the name of x.
The structure is subject to the following conditions:
 (1)
Dom(.σ) = Dom(.p),
i.e.
x has a name iff x has a parent.
 (2)
.p ⋈ .σ is injective, i.e.
different siblings have different names.
The induced downlookup operator
.ℓ_{down}() : A × Σ_{down} ↷ A
is defined as the inverse to .p ⋈ .σ, i.e.
x.ℓ_{down}(α) = y iff
y.p = x and y.σ = α.
Proposition:
Given the forest naming domain Σ, the
following structures are equivalent:

A naming forest (A, ≤, Σ, .σ).

A lookup system (A, Σ_{down}, .ℓ_{down}()) such that
 .ℓ_{down}() is injective,
 .ℓ_{down} (the single
down
step reachability)
is inverse to the parent partial function of a forest.
□
By a pointed naming forest we mean
a naming forest equipped with an artificial root r ∈ A.
By a naming tree we mean a pointed naming forest such that
{r} = A.r.
The default lookup operator
.ℓ() : A × Σ ↷ A
is defined as an extension of .ℓ_{down}() according to the
following table:
Name α 
Semantics 
Terminology 
.ℓ(α) 
a downname  αnamed child 
.ℓ_{down}(α) 
‹p›  parent  .p 
‹p¯›  parentorroot  .p¯ 
‹r›  component root  .r 
‹·›  self  Id ↾ A 
‹··›  dotdotparent 
.p¯ ◑ {(r,r)} 
if r is defined, 
.p¯  otherwise. 

‹/›  current root 
A × {r} 
if r is defined, 
∅  otherwise. 

Pathnames
The set Σ^{∗} of pathnames over a forest naming domain Σ
has a (special) nomenclature according to the following table.
Terminology for a pathname p ∈ P 
Pathname set P expression / description 
Examples 
Unixstyle correspondents 
ordinary (proper) 
absolute 
[‹/›] + (Σ_{down} ∪ {‹··›, ‹·›})^{∗}

[‹/›, "garden", ‹··›, "beds"] 
"/garden/../beds " 
[‹/›] 
"/ " 

canonic 
[‹/›] + Σ_{down}^{∗} 
[‹/›, "garden", "beds"] 
"/garden/beds " 
[‹/›] 
"/ " 
relative 
[‹/›] +
(Σ_{down} ∪ {‹··›, ‹·›})^{+}

["garden", ‹·›, "beds"] 
"garden/./beds " 
extraordinary 
pathnames p that are not ordinary:
 (a)
the empty pathname,
 (b)
p with a nonleading rootname ‹/›,
 (c)
p with a special name other than
‹/›, ‹··› or ‹·›.



(a) [ ] 
(b) ["garden", ‹/›] 
(c) [‹f›, "garden"] 
We also introduce notation Ϲ, Ά and
Ό
for the sets of all canonical, absolute and ordinary pathnames, respectively.
This leads to the following inclusion chain:
canonical 

absolute 

ordinary 

all pathnames 
Ϲ 
⊆ 
Ά 
⊆ 
Ό 
⊆ 
Σ^{∗} 
Note that the set of all ordinary pathnames can be expressed as
Ό = {[ ], [‹/›]} + (Σ_{down} ∪ {‹··›, ‹·›})^{∗}
∖ {[ ]}.
Pathname reduction
We introduce pathname reduction operators
Σ^{∗} → Σ^{∗} as follows:
Notation 
Terminology 
Definition (reduction rules) 
.reduct(‹··›) 
reduction


(p + [‹/›, ‹··›] + q).reduct(‹··›) = p + [‹/›] + q,

(p + [α, ‹··›] + q).reduct(‹··›) = p + q
if α ∈ Σ_{down},

s.reduct(‹··›) = s if the pathname s does not have
a subpathname of the form [‹/›, ‹··›]
or [α, ‹··›],
α ∈ Σ_{down}.
(I.e. the following rules apply:
[‹/›, ‹··›] ↦ [‹/›]
and [α, ‹··›] ↦ [ ], α ∈ Σ_{down}.)

.reduct_{℅}(‹··›) 
Let α = ‹··›. Then

p.reduct_{℅}(α) = p.reduct(α) + [‹·›]
if p.last = α and either
 p.reduct(α) = [ ] or
 p.reduct(α).last ∈ Σ_{down},

p.reduct_{℅}(α) = p.reduct(α) otherwise.

.reduct(‹·›) 
dot reduction 

(p + [‹·›] + q).reduct(‹·›) = p + q,

s.reduct(‹·›) = s if ‹·› ∉ s.
(All dot names ‹·› are removed.)

.reduct_{℅}(‹·›) 
The same rules apply as for .reduct_{℅}(‹··›) but with
α = ‹·›.
(Any nonfinal dot name ‹·› is removed.
The final ‹·›, if any,
is only removed if preceded by ‹/› or ‹··›.)

.reduct(‹/›) 
root reduction 

(p + [‹/›] + q).reduct(‹/›) = [‹/›] + q,

s.reduct(‹/›) = s if ‹/› ∉ s.shift.
(All names preceding the last ‹/› are removed.)

.reduct_{℅} 
reduction 

.reduct_{℅} = .reduct(‹/›).reduct_{℅}(‹··›).reduct_{℅}(‹·›).

.reduct 

.reduct = .reduct(‹/›).reduct(‹··›).reduct(‹·›).


The ℅ symbol means
care of trailing dot
.
The ℅variants of reduction preserve pathname characteristics
of ending with a downname, see proposition (B).
This is important for enforcing directory/nondirectory (leaf/nonleaf)
pathname resolution logic.

Pathname reduction .reduct_{℅}(‹·›).reduct_{℅}(‹··›) is used
in URI resolution
[].
The following example shows the impact of the ℅ feature.
Proposition:
 (A)
For every
.f, .g ∈ {.reduct(‹··›), , reduct_{℅}(‹··›), …, .reduct},
p, q, r ∈ Σ^{∗},
 .f.f = .f,
 .f.g = .g.f,
 (p + q + r).f = (p + q.f + r).f.
 (B)
For every
.f ∈ { reduct_{℅}(‹·›), reduct_{℅}(‹··›), reduct_{℅}},
p ∈ Σ^{∗},
 p.last ∈ Σ_{down} iff p.f.last ∈ Σ_{down}.
 (C)
Ά.reduct = Ϲ,
i.e. .reduct maps absolute pathnames to canonical pathnames.
Canonical pathname tree
Given the forest naming domain Σ,
the (free) canonical pathname tree
is the structure
C = (Ϲ, .p, Σ, .σ, [‹/›])
where

.p = .pop ↾ Ϲ ∖ {[‹/›]},

.σ = .last ↾ Ϲ ∖ {[‹/›]}.
Obviously, C is a naming tree with root [‹/›].
The corresponding lookup
.ℓ_{Ϲ}() : Ϲ × Σ ↷ Ϲ
is defined as follows:
Name α 
Value p.ℓ_{Ϲ}(α) 
a downname  p + [α] 
‹·›  p 
‹··› 
[‹/›]  if p = [‹/›], 
p.pop  otherwise 

‹/›  [‹/›] 
Proposition:
For every canonical pathname p and every ordinary pathname q,
 p.p¯ = (p + [‹··›]).reduct,
 p.ℓ_{Ϲ}^{∗}(q) = (p + q).reduct.
Induced pathname tree
Given a pointed lookup system
(X, Σ, .ℓ(), r) where
with the induced pathname resolution map
.rt : Σ^{∗} ↷ X,
we denote

Ͼ = { q ∈ Ϲ  q.rt is defined },
(i.e.
Ͼ is the set of all resolvable canonical pathnames)

.ℓ_{Ͼ}() = .ℓ_{Ϲ}() ∩
Ͼ × Σ × Ͼ.
Proposition:

Ͼ.p ⊆ Ͼ,

(Ͼ, .p, Σ, .σ, [‹/›])
is a naming tree
– we call it the induced pathname tree,

.ℓ_{Ͼ}() is the corresponding lookup.
□
Dotdot handling modes
Proposition:
Let (X, Σ, .ℓ(), r) be a pointed lookup system
with a forest naming domain Σ
such that
 .ℓ(‹··›) is defined for all x ∈ X,
but does not necessarily have the semantics of a naming forest,
 .ℓ(‹·›) is identity on X.
Let p be an ordinary pathname and consider
the 3 operators
.rt.ℓ^{∗}(p),
.ℓ_{Ϲ}^{∗}(p).rt,
.ℓ_{Ͼ}^{∗}(p).rt : Ϲ ↷ X.
 If ‹··› ∉ p then the operators are equal.
 If ‹··› ∈ p then
the operators are pairwise different in general.
□
We might say that the resolution Ϲ × Ό ↷ X
depends on dotdot handling mode.
This is summarized in the following table.
Operator Ϲ × Ό ↷ X 
Mode name 
Application example 
The cd utility  
.rt.ℓ^{∗}()  physical 
[]
or
[]
with the P switch


.ℓ_{Ϲ}^{∗}().rt 
logical  free 
[]
with the L switch

URI resolution [] 
.ℓ_{Ͼ}^{∗}().rt 
restricted 
[]
with the L switch


Filesystem forest
Filesystem forest
By a filesystem forest we mean a structure
(N, A, ≤, .t)
where
 N is a set of nodes,
 A is a set of arrows, disjoint from N,
 ≤ is a partial order between arrows,
 .t is a function A → N
assigning each arrow its target node.
The structure is subject to the following conditions:
 (1)
(A, ≤) is an algebraic forest.
 (2)
.t is injective on A.p,
i.e.
only leaf arrows a ≠ b can have equal target.
Filesystem forest as a multidigraph
Similarly to algebraic forests, a filesystem forest can be considered
of having one of more equivalent forms:
(1a) 
(N, A, ≤, .t) 
– the original form, 
(1b) 
(N, A, .p¯, .t) 
with .p¯ : A → A, 
(1c) 
(N, A, .p, .t) 
with .p : A ↷ A, 

(2b) 
(N, A, .s¯, .t) 
– the multidigraph (quiver) form, with .s¯ : A → N, 
(2c) 
(N, A, .s, .t) 
–
the partial multidigraph form, with .s : A ↷ N. 
The correspondence (1)–(2) is given by
.p¯ = .s¯.t^{−1} 

.s¯ = .p¯.t 
.p = .s.t^{−1} 
.s = .p.t 
S1: Filesystem forest
By an S1 structure we mean a filesystem forest
N = (N, A, ≤, .t)
such that the following is satisfied:


(S1~1) 
Components of (A, ≤) are finite.

(S1~2) 
Components of (A, ≤) have disjoint .timages
(i.e. .t^{−1}.r is a partial function N ↷ A.r).

Arrows from A are also called hardlinks.
Condition (S1~2) means that crosstree hardlinks are not allowed.
The following table shows basic correspondence of the S1 structure
to its inmemory representation in the Linux kernel.
 Inmemory representation 
Domain  Function  Type (structure)  Reference 
N  
inode  
 .t^{−1}.r 
 .i_sb>s_root 
A  
dentry  
 .t 
 .d_inode 
 .p¯ 
 .d_parent 
 .r 
 .d_sb>s_root 
S2: Mounted filesystem forest
By an S2 structure we mean an S1 structure
equipped with a mount structure
M = (M, ≤, ≤_{mpl}, .s¯, .t)
where
 M
is a finite set of mounts, disjoint from both N and A,
 ≤
is a partial order between mounts,
 ≤_{mpl}
is a componentwise linear order between mounts,
encoding mountpoint lists,
 .s¯, .t
are functions M → A assigning each mount its source
and target, respectively
(i.e. (M, A, .s¯, .t) is a multidigraph).
The structure is subject to the following conditions:


(S2~1) 
(M, ≤) is an algebraic forest.

(S2~2) 
(x.p,x.s) = (y.p,y.s)
iff
x.r_{mpl} = y.r_{mpl} for every mounts x, y
(i.e.
mounts have equal mountpoint source iff they are comparable in ≤_{mpl}).

(S2~3) 
M.t ⊆ A.r,
i.e. mount targets are roots in (A, ≤).

(S2~4) 
m.s¯.r = m.p¯.t for every mount m
(i.e. m's source belongs to m parent's target tree).

(S2~5) 
m.s¯.t = a.t = b.t implies a = b
for every mount m and arrows a, b
(i.e. (targets of) mount sources cannot be multiply hardlinked).

By .s : M ↷ A we mean the restriction of .s¯
to nonroot mounts.
The following table shows basic correspondence of the mount structure
to the Linux kernel source code.
 Inmemory representation 
Domain  Function  Type (structure)  Reference 
M  
vfsmount  
 .s¯ 
 .mnt_mountpoint 
 .t 
 .mnt_root 
 .p¯ 
 .mnt_parent 
 .r 
 .mnt_ns>root 
 .nei_{mpl}(−) 
 .mnt_hash.prev (*) 
 .nei_{mpl}(+) 
 .mnt_hash.next (*) 

Application of the
list_entry
macro is required:
x.nei_{mpl}(−)
corresponds to list_entry(y, struct vfsmount, mnt_hash)
iff y
equals x.mnt_hash.prev
and is not the head of x.mnt_hash
.

Mountpoint lists (i.e. components of (M, ≤_{mpl}) do not
have direct onetoone correspondence to
.mnt_hash
lists
in general
–
a hash list can be merged from several mountpoint lists.
The direct corresponence is established by a mountpoint equality check
in __lookup_mnt
.
Derived arrows
We denote
 Ѧ =
{ (m,a) ∈ M × A  m.t = a.r }
the set of derived arrows (or namespace points).
Informally,
a derived arrow is a hardlink in the context of one of the (possibly many)
mounts that bind the filesystem tree which the hardlink belongs to.
Proposition:
Ѧ is finite.
□
We use the (m,a) signature for derived arrows,
so that .m and .a are
the natural projections,
e.g. (m,a).a = a for (m,a) ∈ Ѧ.
Derived arrows Ѧ have the following form in the Linux source:
struct path {
struct vfsmount *mnt;
struct dentry *dentry;
};
Derived nodes
For the sake of completeness we also define the set
 N^{′} =
{ (m,a.t) ∈ M × N  (m,a) ∈ Ѧ }
of derived nodes.
Derived arrows are mapped to derived nodes by the
target function
.t : Ѧ ↠ N^{′} defined by (m,a).t = (m,a.t).
Mountpoints
The mountpoint list function
.mpl : Ѧ ↷ Ѧ^{∗} is defined by

x ∈ (m,a).mpl iff
(x.p,x.s) = (m,a),

x appears before y in (m,a).mpl iff
x ≤_{mpl} y.
Derived arrows with nonempty mountpoint list are called mountpoints.
Mountfront and mountback
The mountfrontpredecessor partial function
.m_{f} : Ѧ ↷ Ѧ is defined by

(m,a).m_{f} is undefined
if (m,a) is not a mountpoint
(i.e. the set X = { x ∈ M  (x.p,x.s) = (m,a) } is empty),

(m,a).m_{f} = (y, y.t) otherwise,
where y = (m,a).mpl.last
(the last mount in the mountpoint list of (m,a)).
The mountfront function
.f : Ѧ → Ѧ is defined recursively by

x.f = x
if
x.m_{f} is undefined,

x.f = x.m_{f}.f otherwise.
The mountbackparent partial function
.m_{b} : Ѧ ↷ Ѧ is defined by

(m,a).m_{b} is undefined
if
a ∉ A.r or (m,a) ∈ M.r × A.r,

(m,a).m_{b} = (m.p, m.s) otherwise.
The mountback function
.b : Ѧ → Ѧ is defined recursively by

x.b = x
if
x.m_{b} is undefined,

x.b = x.m_{b}.b otherwise.
Proposition:
 (i)
.f and .b are root maps
of forests (Ѧ, .m_{f}) and (Ѧ, .m_{b}),
respectively.
In addition, .m_{f} is injective.
 (ii)
.f.b = .b.
 (iii)
.b.p.a = .b.a.p.
 (iv)
A derived arrow x is a mountpoint iff x ∉ Ѧ.f.
□
The fine forest
The parent function .p¯ : Ѧ → Ѧ is defined by

(m,a).p¯ = (m,a.p) if a ∉ A.r, else

(m,a).p¯ = (m.p, m.s) if m ∉ M.r, else

(m,a).p¯ = (m,a).
We denote (Ѧ, ≤)
the reflexive transitive closure of (Ѧ, p¯)
and
.p the restriction of .p¯ to nonfixed points.
Proposition:
 (i)
(Ѧ, ≤) is an algebraic forest.
 (ii)
.f is idempotent and decreasing,
.b is idempotent, increasing and monotone
(a closure operator) in (Ѧ, ≤).
For every x,y ∈ Ѧ,

x.f.f = x.f ≤ x ≤ x.b = x.b.b,

x ≤ y implies x.b ≤ y.b.
 (iii)
(N^{′}, Ѧ, ≤, .t) is a filesystem forest.
 (iv)
(Ѧ, .m_{f}^{−1})
⊆ (Ѧ, .m_{b})
⊆ (Ѧ, .p)
(forest inclusion).
Proof:
(i)
Follows from the fact that
(Ѧ, .p) is a weak substructure of
the lexicographic product of (M, .p) and (A, .p),
respectively.
□
The upfront forest
The upfront parent partial function
.p_{f} : Ѧ ↷ Ѧ is defined by
.p_{f} = .b.p.f.
We denote (Ѧ, ≤_{f}) the reflexive transitive closure of
(Ѧ, p_{f}).
Proposition:
 (i)
x.b < x.p_{f}.b whenever x.p_{f} is defined.
 (ii)
(Ѧ, ≤_{f}) is an algebraic forest.
 (iii)
.b is strictly monotone w.r.t. (Ѧ, ≤_{f})
and (Ѧ, ≤),
i.e.
x <_{f} y implies x.b < y.b.
 (iv)
(N^{′}, Ѧ, ≤_{f}, .t) is a filesystem forest.
Proof:
(i)
(x.p_{f}).b = (x.b.p.f).b
= x.b.p.(f.b)
= x.b.p.b
> x.b.p > x.b.
□
The downfront forest
The downfront parent partial function
.p_{t} : Ѧ ↷ Ѧ is defined by
.p_{t} = .b.p ↾ Ѧ.b.f,
i.e.
x.p_{t} = y iff
x = x.b.f and y = x.b.p.
We denote (Ѧ, ≤_{t}) the reflexive transitive closure of
(Ѧ, p_{t}).
Proposition:
Same statements apply as for the upfront forest,
with f replaced by t.
□
S3: Naming
By an S3 structure we mean an S2 structure
equipped with a naming structure
(Σ, .σ)
assigning each nonroot hardlink a name that is distinct within sibling hardlinks,
i.e.


(S3~1) 
(A, ≤, Σ, .σ) is a naming forest.

This in particular defines the lookup function
.ℓ() : A × Σ ↷ A
for the hardlink forest (A, ≤).
The fine lookup
The fine lookup operator
.ℓ() : Ѧ × Σ ↷ Ѧ
defines navigation in the derived forest
(Ѧ, ≤) according to the following table.
α 
x.ℓ(α) where x = (m,a) 
Correspondent function from the Linux kernel 
a downname 
(m, a.ℓ(α)) 
__d_lookup 
‹r›  x.r  
‹p›  x.p  
‹p¯›  x.p¯ 
follow_dotdot without
equality check of nd>path and nd>root ,
follow_mount

‹m_{f}›  x.m_{f} 
lookup_mnt / follow_down_one 
‹f›  x.f 
follow_mount / follow_down 
‹m_{b}›  x.m_{b} 
follow_up 
‹b›  x.b 
iteration of follow_up in follow_dotdot 

The function
follow_down
version2.6.37 corresponds to
follow_down_one
version2.6.38.

The function
follow_down
version2.6.38 does not seem to have
a counterpart in version 2.6.37.
Downfront naming
We define partial functions .σ_{t} : Ѧ ↷ Σ_{down}
and
.ℓ_{t}() : Ѧ × Σ_{down} ↷ Ѧ
by

.σ_{t} = .b.a.σ ↾ Ѧ.f,

.ℓ_{t}(α) = .ℓ(α).f
for every downname α.
(Recall that .a is the projection (m,a) ↦ a).
Proposition:
(Ѧ, ≤_{t}, Σ, .σ_{t})
is a naming forest with
.ℓ_{t}() the correspondent downlookup.
□
The .ℓ_{3} lookup
The partial function
.ℓ_{3}() : Ѧ × Σ ↷ Ѧ
is defined as follows
α 
.ℓ_{3}(α) 
a downname 
.ℓ(α).f (= .ℓ_{t}(α)) 
‹r› 
.r_{f} 
‹p¯› 
.p¯_{f} 
Informally,
upsteps are performed in the upfront forest,
downsteps are performed in the downfront forest.
S4: Current root
By an S4 structure we mean an S3 structure
equipped with (r)
where
 r is a derived arrow called the current root.
We have the following correspondence:
Our notation 
Linux kernel 
r 
current>fs>root 
The .ℓ_{4} lookup
The lookup operator
.ℓ_{4}() : Ѧ × Σ ↷ Ѧ
is defined according to the following table:
α 
x.ℓ_{4}(α) 
Linux kernel function 
a downname 
x.ℓ(α).f 
do_lookup 
‹r›  x.r.f  
‹/›  r  
‹··› 
r.f  if x ≤ r ≤ x.b, 
x.b.p¯.f  otherwise 

follow_dotdot 
Proposition:
 (A)
The following holds for differences
between .ℓ_{4}() and .ℓ_{3}():

The down lookup is coincident.

Root lookups .ℓ_{4}(‹r›) and .ℓ_{3}(‹r›)
coincide
iff
Ѧ.r ⊆ Ѧ.f.

Root lookups .ℓ_{4}(‹/›) and .ℓ_{3}(‹r›)
coincide
iff
{r} = Ѧ.r.

Parent lookups
.ℓ_{4}(‹··›) and .ℓ_{3}(‹p¯›) coincide
iff
r ∈ Ѧ.r ⊆ Ѧ.f.
 (B)
Denote .p¯_{4} = .ℓ_{4}(‹··›).

(Ѧ, .p¯_{4}) is a forest.

Ѧ.r_{4} = Ѧ.r.f ∪ {r.f}.
□
Ancestor pathname
The function .apn : Ѧ → Σ^{+}
is defined according to the (first two columns of the) following table.
x.apn 
Condition 
String representation of
x.apn[0] in Linux VFS

[‹/›] 
x.f = x.r_{4} = r.f 
x ≤ r 
"/ " 
[‹/›] + x.ps_{4}.pop.reverse.map{ x x.b.a.σ } 
x.f ≠ x.r_{4} = r.f 
[‹r›] 
x.f = x.r_{4} ≠ r.f 
x ≰ r 
"(unreachable)/ " 
[‹r›] + x.ps_{4}.pop.reverse.map{ x x.b.a.σ } 
x.f ≠ x.r_{4} ≠ r.f 
The pathname x.apn is said to be the
ancestor pathname of x.
Proposition:
In general, .apn is NOT injective.
□
The canonical tree Ѧ_{Ⅰ}
Going from the pointed lookup system
(Ѧ, Σ, .ℓ_{4}(), r),
we denote
.rt_{4} : Σ^{∗} ↷ Ѧ
the induced pathname resolution map
and
Ͼ_{4} the set of all resolvable canonical paths.
We denote
Ѧ_{Ⅰ} = r.ℓ_{4}^{∗}(Σ_{down}^{∗}),
the canonical tree.
Proposition:
The maps
Ͼ_{4} Ѧ_{Ⅰ}
and
Ѧ_{Ⅰ} Ͼ_{4}
are inverse to each other.
□
The reachable (double)tree Ѧ_{Ⅱ}
We denote
Ѧ_{Ⅱ} =
Ѧ_{Ⅰ} ∪ r.ℓ_{4}^{∗}([‹··›] + Σ_{down}^{∗}),
the (root)reachable (double)tree.
Proposition:

Ѧ_{Ⅱ} = r.ℓ_{4}^{∗}(Ό)
i.e.
Ѧ_{Ⅱ} is the set of all derived arrows that
are reachable from the current root r by
an .ℓ_{4}^{∗}()lookup over ordinary pathnames.

(Ѧ_{Ⅱ}, ≤_{f}) = (Ѧ_{Ⅱ}, ≤_{t}),
i.e. restrictions of the upfront forest and the downfront forest to Ѧ_{Ⅱ}
are equal.

The set of roots of the forest (Ѧ_{Ⅱ}, ≤_{f}) equals
{ r, r.f }.
The forest (Ѧ_{Ⅱ}, ≤_{f}) is a tree
if r is not a mountpoint,
otherwise it has two components:

r.ℓ_{4}^{∗}(Σ_{down}^{∗}) = Ѧ_{Ⅰ} –
the
canonical
/ standard
tree,

r.f.ℓ_{4}^{∗}(Σ_{down}^{∗})
= Ѧ_{Ⅱ} ∖ Ѧ_{Ⅰ}
– the
front
tree.

The restriction of .ℓ_{4}() to
Ѧ_{Ⅱ} × Σ_{down}
is a downlookup of (Ѧ_{Ⅱ}, ≤_{f}).

The structure (Ѧ_{Ⅱ}, .p¯_{4}, r.f)
is an algebraic tree.

Ѧ_{Ⅱ} ∖ {r} ⊆ Ѧ.f,
i.e.
except possibly for the current root,
an element of Ѧ_{Ⅱ} is never a mountpoint.
□
/
(r) 
bin 
boot 
… 
usr 
var 

… 


… 



… 


… 



The front
tree can be created by
the following shell commands (root privileges are assumed).
cd
navigation between
Ѧ_{Ⅰ} and
Ѧ_{Ⅱ} ∖ Ѧ_{Ⅰ}
As of GNU bash version 4.2.8,
the switches P and L
play role for
the cd
(1)navigation to/from the front
tree.
This is shown in the following example
(note that symbolic links are NOT involved):
Root reachability stratification
We denote Ѧ_{Ⅲ} = r.p^{∗}^{−1}
= { x ∈ Ѧ  x ≤ r }.
We can now summarize
the rootreachability stratification
Ѧ_{Ⅰ} ⊆ Ѧ_{Ⅱ} ⊆ Ѧ_{Ⅲ} ⊆ Ѧ_{Ⅳ} =
Ѧ
by .ℓ_{4}^{∗}lookup over ordinary pathnames
as follows.
Stratum 
Description 
Ancestor pathname characteristics 



Ѧ_{Ⅰ} = r.ℓ_{4}^{∗}(Σ_{down}^{∗}) 
The canonical tree:
derived arrows reachable via a canonical path. 
r.ℓ_{4}^{∗}(x.apn) = x 


Ѧ_{Ⅱ} =
Ѧ_{Ⅰ} ∪ r.ℓ_{4}^{∗}([‹··›] + Σ_{down}^{∗})

The reachable (double)tree:
derived arrows reachable via an absolute path. 
r.ℓ_{4}^{∗}(x.apn) = x or
r.ℓ_{4}^{∗}([‹··›] + x.apn.shift) = x 

Ѧ_{Ⅲ} = r.p^{∗}^{−1} 
The inverse reachable tree:
derived arrows under the current root. 
x.apn[0] = ‹/› 
Ѧ_{Ⅳ} = Ѧ 



It is possible to have
x ∈ Ѧ_{Ⅰ},
y ∈ Ѧ_{Ⅱ} ∖ Ѧ_{Ⅰ}, and
z ∈ Ѧ_{Ⅲ} ∖ Ѧ_{Ⅱ}
such that
x.apn = y.apn = z.apn.

The following Ruby code shows a derived arrow
(originally pointed to by
/home
) outside
Ѧ_{Ⅲ}.
Dir.chdir "/home"
Dir.chroot "/usr"
puts Dir.pwd #> (unreachable)/home
S5: Basic file types
By an S5 structure we mean an S4 structure
equipped with (.btype)
where
 .btype is an attribute of nodes N
assigning each node its basic type, also called
the unix file type .
We introduce the following terminology / notation based on .btype:
Value y of x.btype 
Terminology for x 
Notation for the set { x ∈ X  x.btype = y } 
Linux kernel code:
x.i_mode & S_IFMT 
'' 
regular file 

S_IFREG 
'd' 
directory 
X.dirs 
S_IFDIR 
'l' 
symbolic link or symlink 
X.links 
S_IFLNK 
'b' , 'c' , 'p' , 's' ,
… 
not introduced 

In particular, N.dirs denotes the set of all directories.
The structure is subject to the following conditions:


(S5~1) 
A.s¯ ⊆ N.dirs,
i.e. hardlink sources are directories.

(S5~2) 
M.s¯.t ⊆ N.dirs,
i.e. mount sources point to directories.

(S5~3) 
r.a.t ∈ N.dirs,
i.e. the current root points to a directory.

The .ℓ_{5} lookup
The .ℓ_{5}() lookup operator is defined as
the restriction of .ℓ_{4}() to directories.
More precisely,
x.ℓ_{5}(α) = y
iff
y = x.ℓ_{4}(α) and x.a.t is a directory.
S6: Symbolic links
By an S6 structure we mean an S5 structure
equipped with (.lpath)
where
 .lpath is a function N.links → Σ^{∗}.
If x is a symbolic link, then x.lpath is called the
target pathname of x.
Path resolution stack
By a path resolution stack
we mean an element of Ѧ × Σ^{+}^{∗},
i.e. it is a pair
(a, paths) where

a is a derived arrow,

paths is a (possibly empty) list of nonempty pathnames.
We denote PRS the set of all path resolution stacks.
A path resolution stack corresponds to the nameidata
structure
from the Linux kernel source code according to the following table.
The PRS structure (a,paths) 
The nameidata structure 
a 
path 
paths 
saved_names 
paths.length 
depth 
We define the partial operator
.pp : Σ^{+}^{∗} ↷ Σ^{+}^{∗}
by

paths.pp is undefined if paths is empty, else

paths.pp = paths.pop if paths.last.length = 1, else

paths.pp = paths.pop + [paths.last.shift].
This means that .pp removes the first name from the last pathname in
paths and if this results in an empty last list then this list is removed too.
The partial operation .p_{F} : PRS ↷ PRS is then defined
by
 (A)
(a,paths).p_{F} is undefined if paths is empty and
a.a.t is not a symbolic link.
 (B)
Else if a.a.t is a symbolic link

(a,paths).p_{F} = (a.p, paths)
if a.a.t.lpath is empty, else

(a,paths).p_{F} = (a.p, paths + [a.a.t.lpath]).
 (C)
Else if a.a.t is a directory,
denote α = paths.last[0].

(a,paths).p_{F} is undefined if a.ℓ_{4}(α) is undefined, else

(a,paths).p_{F} = (a.ℓ_{4}(α), paths.pp).
 (D)
Otherwise (a.a.t is neither a directory nor a symbolic link)

(a,paths).p_{F} is undefined.
For x ∈ PRS, the value x.p_{F}, if defined,
is called the full resolution successor of x.
The partial operation of a normal resolution successor
.p_{N} : PRS ↷ PRS is defined as a restriction
of .p_{F}
to Ѧ × Σ^{+}^{+}.
Definition domains of .p_{F} and .p_{N} can
be summarized as follows.
Condition for x = (a, paths) ∈ PRS,
α = paths.last[0] 
POSIX error code 
Full resolution successor x.p_{F} 
Normal resolution successor x.p_{N} 
paths = [ ] and 
a.a.t ∈ N.links 

defined 
undefined 
a.a.t ∉ N.links 
undefined 
paths ≠ [ ] and 
a.a.t ∉ N.dirs ∪ N.links 
ENOTDIR 
a.a.t ∈ N.dirs and
a.ℓ_{4}(α) is undefined
 ENOENT 
otherwise 

defined and x.p_{F} = x.p_{N}

Proposition:

.p_{F} (resp. .p_{N}) has no fixed points.

The monounary algebra (PRS, p¯_{F})
(resp. (PRS, p¯_{N})) is NOT an algebraic forest in general.
□
In the following subsections,
we will use the endpoint (alias root) partial functions
.r_{F}, .r_{N} : PRS ↷ PRS
induced by monounary algebras (PRS, p¯_{F})
and (PRS, p¯_{N}), respectively.
Link follower
The (strict direct) link follower partial function
.p_{lnk} : Ѧ ↷ Ѧ
is defined by
a.p_{lnk} = b 
iff 
 (a, [ ]).p_{F}.r_{N} = (b, [ ]), and
 a ≠ b.

Note that only symbolic links can have link followers.
Link target
The link target partial function
.lt : Ѧ ↷ Ѧ
is defined by
a.lt = b 
iff 
(a, [ ]).r_{F} = (b, [ ]).

Proposition:
a.lt = b iff
a.r_{lnk} = b and b.a.t ∉ N.links
(i.e. b is the link target of a iff
b is the last link follower of a and does not point to a symlink).
□
Linked lookup
The linked lookup operators
.ℓ_{F}(), .ℓ_{N}() : Ѧ × Σ ↷ Ѧ
are defined by
a.ℓ_{F}(α) = b 
iff 
(a, [[α]]).r_{F} = (b, [ ])

(.ℓ_{N}() is defined the same way,
with F replaced by N).
Proposition:
Let α ∈ Σ,
d ∈ Ѧ such that d.a.t is directory. Then
.ℓ_{N}(α) = 
.lt.ℓ_{5}(α) 

d.ℓ_{N}(α) = 
d.ℓ_{5}(α) 
.ℓ_{F}(α) = 
.lt.ℓ_{5}(α).lt 
d.ℓ_{F}(α) = 
d.ℓ_{5}(α).lt 
□
Pathname resolution
The pathname resolution operators
.ϖ_{F}(), .ϖ_{N}() : Ѧ × Σ^{∗} ↷ Ѧ
are defined by
a.ϖ_{F}(p) = b 
iff 
 p = [ ] and a = b, or
 p ≠ [ ] and (a, [p]).r_{F} = (b, [ ]),

and a.ϖ_{N}(p) by the same way,
with F replaced by N.
This means that a nonempty pathname p is resolved to the derived arrow
b relative to a derived arrow a
if the path resolution stack (a, [p])
resolves to the endpoint (b, [ ]).
Proposition:

.ϖ_{F}() = .ϖ_{N}().lt.

.ϖ_{F}() = .ℓ_{F}^{∗}()
(the pathname lookup induced by .ℓ_{F}()).

The previous statement with F replaced by N.
□
Considering the pair of partial maps
.ϖ_{F}() and .ϖ_{N}()
as a single partial map
ϖ :
Ѧ × Σ^{∗} × {F, N} ↷ Ѧ,
we obtain the following correspondence with the Linux kernel source code:

Linux kernel source code 
Function  Arguments 
Return value 
Function  Arguments 
Input  Output 
ϖ 


vfs_path_lookup 


 a.m 

 mnt 

 a.a 

 dentry 

 p 

 name 

 rtype 

 flags & LOOKUP_FOLLOW 

 
b 
 
nd.path 
F / N modes in use
The following table shows examples of Linux system calls and shell commands
divided according to which pathname resolution mode they use.
F mode (full pathname resolution .ϖ_{F}()) 
N mode (normal pathname resolution .ϖ_{N}()) 
stat (2) 
lstat (2) 
stat (1) L 
stat (1) 
file (1) L 
file (1) h 
realpath (3) 


readlink (3) 
Induced pathname trees
Going from pointed lookup systems
(Ѧ, Σ, .ℓ_{F}(), r) and
(Ѧ, Σ, .ℓ_{N}(), r),
we denote
.rt_{F}, .rt_{N} : Σ^{∗} ↷ Ѧ
the induced pathname resolution maps
and
Ͼ_{F}, Ͼ_{N} the induced pathname trees,
i.e. Ͼ_{F} and Ͼ_{N} are
the sets all canonical pathnames resolvable via
.ϖ_{F}() and .ϖ_{N}(), respectively.
Proposition:

Ͼ_{N}.p¯ ⊆ Ͼ_{F} ⊆ Ͼ_{N}.

Ͼ_{4} ⊆ Ͼ_{N}.

.rt_{N} ↾ Ͼ_{4}
=
.rt_{4} ↾ Ͼ_{4}.
□
Virtual identity layers
There are four basic layers of identity of filesystem objects,
corresponding to the function chain
Ͼ_{F} Ѧ
A
N
(or
Ͼ_{N} Ѧ
A
N).

Identity layers are summarized in the following table.
Notation 
Terminology 
N 
Filesystem nodes 

A 
Hardlinks 
Lowlevel filesystem identity 
Ѧ 
Derived arrows (namespace points) 
Operating system identity 
Ͼ_{F} / Ͼ_{N} 
Resolvable canonical pathnames 
Browsing identity 
Hierarchical structures of particular layers do NOT correspond in general.
More precisely, the following diagram does not necessarily commute.
Ͼ_{F} 

Ѧ 

A 

N 
↑ 
.p¯ 
↑ 
.p¯_{4} 
↑ 
.p¯ 
⤉ 
Ͼ_{F} 

Ѧ 

A 

N 
The noncommutativity of the left square relates to
the difference between
logical (.p¯.rt_{F})
and
physical (.rt_{F}.p¯_{4})
dotdot handling
[],
[],
[].
Further resolvability restrictions
We define pathname resolution operators
.ϖ_{6F}() and
.ϖ_{6N}() as restrictions of
.ϖ_{F}() and
.ϖ_{N}(), respectively,
by imposing symbolic link limits (group B) according to the following table
(only the Fvariant is shown):
Condition for x = (a, [p]),
a ∈ Ѧ, p ∈ Σ^{∗} 
POSIX error code 
Group 
Terminology 
Description 
A 
Endpoint existence 
x.r_{F} is defined 
ELOOP 
Intermediate resolvability / pathname vanishment 
x.r_{F}.paths = [ ] 
ENOTDIR 
ENOENT 
B 
Symlink nesting limit 
y.paths.length ≤ 8 for every y ∈ x.ps_{F} 
ELOOP 
Symlink count limit 
x.ps_{F}.select{ y y.a.a.t ∈ N.links }.length ≤ 40 
ELOOP 
C 
Pathname size limit 
Not expressible in the S6 structure 
ENAMETOOLONG 
Access restriction 
Restriction of .ℓ_{4}(), not expressible in the S6 structure 
EACCES 
D 
Pathname properness 
(a) p is nonempty 
ENOENT 
(b) ‹/› ∉ p.shift 

(c) p has no extraordinary name 
For illustration, other restrictive conditions are listed:

(A)
Restrictions already applied to .ϖ_{F}().

(C)
Restrictions to be introduced in a further refinement of the S6 structure.

(D)
Rejecting extraordinary pathnames.
Proposition:

.ϖ_{6F}(p+q) ⊆ .ϖ_{6F}(p).ϖ_{6F}(q)
for every pathnames p, q.
Due to the symlink count limit,
equality does not hold in general.
(I.e. .ϖ_{6F}() is a pathname resolution operator,
but – in contrast to .ϖ_{F}() –
not a pathname lookup operator.)

The previous statement with F replaced by N.
Bibliographic references


Tim BernersLee, Roy Fielding, Larry Masinter,
Uniform Resource Identifier (URI): Generic Syntax,
STD 66, RFC 3986,
2005,
http://labs.apache.org/webarch/uri/rfc/rfc3986.html


Daniel P. Bovet, Marco Cesati,
Understanding the Linux Kernel, 3rd Edition,
O'Reilly,
2005


Andries Brower,
The Linux kernel: Some remarks on the Linux kernel,
2003,
http://www.win.tue.nl/~aeb/linux/lk/lk.html


R.T.E.H Dresens,
Testing the Linux VFS with two mCRL2 models,
Master thesis,
2006,


Miguel A. Ferreira, José N. Oliveira,
An Integrated Formal Methods ToolChain and its Application to Verifying a File System Model,
Formal Methods: Foundations and Applications, LNCS 5902,
2009,


Bruce Fields,
Notes on the Linux kernel,
http://www.fieldses.org/~bfields/kernel/


Glenn S. Fowler, David G. Korn, K.Phong Vo,
An Efficient File Hierarchy Walker,
In Proc. of the Summer '89 Usenix Conference,
1989,


Andy Galloway, Gerald Lüttgen, Jan Tobias Mühlberg, Radu I. Siminiceanu,
ModelChecking the Linux Virtual File System,
10th International Conference, VMCAI,
2009,


Wim H. Hesselink, Muhammad Ikram Lali,
Formalizing a hierarchical file system,
Electronic Notes in Theoretical Computer Science (ENTCS), Volume 259,
2009,


IEEE and The Open Group,
IEEE Std. 1003.12004: Portable Operating System Interface (POSIX) Base Specifications, Issue 6,
2004,
http://pubs.opengroup.org/onlinepubs/009695399/


IEEE and The Open Group,
IEEE Std. 1003.12008: Portable Operating System Interface (POSIX) Base Specifications, Issue 7,
2008,
http://pubs.opengroup.org/onlinepubs/9699919799/


Java Community Process, Day Software AG,
JCR v2.0 Specification,
2009,
http://www.day.com/specs/jcr/2.0/


Michael Kerrisk,
The Linux Programming Interface: A Linux and UNIX System Programming Handbook,
No Starch Press,
2010,


(The GNU/Linux community)
Linux man pages,
2011,
http://linux.die.net/man/


Robert Love,
Linux Kernel Development, 2nd Edition,
Sams Publishing,
2005


Wolfgang Mauerer,
Professional Linux Kernel Architecture,
Wiley Publishing, Inc.,
2008


David Flanagan, Yukihiro Matsumoto,
The Ruby Programming Language,
O'Reilly
2008


(nLab authors),
The nLab,
2011,
http://ncatlab.org/nlab/


Ondřej Pavlata,
The Ruby Object Model: Data Structure in Detail,
2011,
http://www.atalon.cz/rbom/rubyobjectmodel


Oracle Corporation,
MySQL 5.0 Reference Manual,
2011,
http://dev.mysql.com/doc/refman/5.0/en/


Chet Ramey,
Bash  the GNU shell,
;login: the USENIX Association newsletter,
December 1994,
http://tiswww.case.edu/php/chet/bash/article.pdf


Claudia Salzberg Rodriguez, Gordon Fischer, Steven Smolski,
The Linux Kernel Primer: A TopDown Approach for x86 and PowerPC Architectures,
Prentice Hall PTR,
2005


Gayathri Tk, Jayush Luniya,
RTFS : Real Time File System,
2005,
http://moss.csc.ncsu.edu/~mueller/rt/rt05/readings/g7/finalReport.pdf


Al Viro,
Re: silent semantic changes with reiser4,
LWN.net,
2004,
http://lwn.net/Articles/100183/

Browser compatibility
To be viewed correctly,
this document requires advanced browser features to be supported, including

CSS level 3,

HTML entities (Ϲ, Ͼ, ⤉, Ѧ, , ‹, ›, ¯, ·, ř, ‖, ‣, ›, ℅, ℓ, ℕ, Ⅰ, Ⅱ, Ⅲ, Ⅳ, →, ↠, ↦, ↷, ↾, ⇡, ∅, ∋, ∖, ∗, ∞, ≠, ≰, ⊎, ⋈, ⋯, Ά, Ό, □, ○, ◐, ◑, ◖, ◗, ϖ, Λ, Σ, Τ, α, ∩, ∪, é, ≥, >, ↔, …, ∈, ←, ≤, <, —, −, , –, ∉, ‰, ′, →, σ, ⊆, ×, ↑, ü, ),

JavaScript.
Document history
June  2  2011 
The initial release.

August  27  2018 
Correction in the definition of a multidigraph:
.s and .t are functions A → N.

September  2  2018 
Corrections in the Ordered tree section.
The ≤_{L+} order renamed to ≤_{L},
the ≤_{L−} order removed.

License
This work is licensed under a
Creative Commons AttributionNonCommercialShareAlike 3.0 License.