[FE training-materials-updates] Major rewrite of the block filesystem part

Thomas Petazzoni thomas.petazzoni at free-electrons.com
Wed Sep 7 16:18:37 CEST 2016


Repository : git://git.free-electrons.com/training-materials.git
On branch  : master
Link       : http://git.free-electrons.com/training-materials/commit/?id=443ceaab47322130db8e0300397a504610b731ec

>---------------------------------------------------------------

commit 443ceaab47322130db8e0300397a504610b731ec
Author: Thomas Petazzoni <thomas.petazzoni at free-electrons.com>
Date:   Wed Sep 7 16:18:37 2016 +0200

    Major rewrite of the block filesystem part
    
    - One slide about partitioning, to talk about the partition table and
      the tools to create/manipulate it
    - One slide about "dd" and how to use it to transfer data to/from a
      block device
    - Introduce filesystems more clearly:
     - One slide about ext2/ext3/ext4
     - Then the explanation about journaling
     - One slide about other Unix filesystems (btrfs, XFS, JFS, ReiserFS)
     - One slide about compatibility filesystems (VFAT, NTFS, HFS, ISO9660)
     - Slides about squashfs, f2fs, tmpfs
    - Slides on how to use block filesystems
     - ext2/3/4
     - loop for mounting images
     - squashfs
    
    Signed-off-by: Thomas Petazzoni <thomas.petazzoni at free-electrons.com>


>---------------------------------------------------------------

443ceaab47322130db8e0300397a504610b731ec
 .../sysdev-block-filesystems.tex                   | 294 +++++++++++++--------
 1 file changed, 180 insertions(+), 114 deletions(-)

diff --git a/slides/sysdev-block-filesystems/sysdev-block-filesystems.tex b/slides/sysdev-block-filesystems/sysdev-block-filesystems.tex
index c54f424..c3d7f6c 100644
--- a/slides/sysdev-block-filesystems/sysdev-block-filesystems.tex
+++ b/slides/sysdev-block-filesystems/sysdev-block-filesystems.tex
@@ -1,5 +1,7 @@
 \section{Block filesystems}
 
+\subsection{Block devices}
+
 \begin{frame}
   \frametitle{Block vs. flash}
   \begin{itemize}
@@ -13,9 +15,10 @@
     basis, in random order, without erasing.
     \begin{itemize}
     \item Hard disks, floppy disks, RAM disks
-    \item USB keys, Compact Flash, SD card: these are based on
-      flash storage, but have an integrated controller that emulates a block
-      device, managing and erasing flash sectors in a transparent way.
+    \item USB keys, SSD, Compact Flash, SD card, eMMC: these are based
+      on flash storage, but have an integrated controller that
+      emulates a block device, managing the flash in a transparent
+      way.
     \end{itemize}
   \item {\bf Raw flash devices} are driven by a controller on the
       SoC. They can be read, but writing requires erasing, and often occurs
@@ -46,32 +49,85 @@ major minor #blocks name
   \end{itemize}
 \end{frame}
 
-\begin{frame}
-  \frametitle{Traditional block filesystems}
-  Traditional filesystems
+\begin{frame}{Partitioning}
+  \begin{itemize}
+  \item Block devices can be partitioned to store different parts of a
+    system
+  \item The partition table is stored inside the device itself, and is
+    read and analyzed automatically by the Linux kernel
+    \begin{itemize}
+    \item \code{mmcblk0} is the entire device
+    \item \code{mmcblk0p2} is the second partition of \code{mmcblk0}
+    \end{itemize}
+  \item Two partition table formats:
+    \begin{itemize}
+    \item {\em MBR}, the legacy format
+    \item {\em GPT}, the new format, not yet used everywhere, but
+      becoming more and more common
+    \end{itemize}
+  \item Numerous tools to create and modify the partitions on a block
+    device: \code{fdisk}, \code{cfdisk}, \code{sfdisk}, \code{parted},
+    etc.
+  \end{itemize}
+\end{frame}
+
+\begin{frame}{Transferring data to a block device}
+  \begin{itemize}
+  \item It is often necessary to transfer data to or from a block
+    device in a {\em raw} way
+    \begin{itemize}
+    \item Especially to write a {\em filesystem image} to a block
+      device
+    \end{itemize}
+  \item This directly writes to the block device itself, bypassing any
+    filesystem layer.
+  \item The block devices in \code{/dev/} allow such {\em raw} access
+  \item \code{dd} is the tool of choice for such transfers:
+    \begin{itemize}
+    \item \code{dd if=/dev/mmcblk0p1 of=testfile bs=1M count=16}\\
+      Transfers 16 blocks of 1 MB from \code{/dev/mmcblk0p1} to
+      \code{testfile}
+    \item \code{dd if=testfile of=/dev/sda2 bs=1M seek=4}\\
+      Transfers the complete contents of \code{testfile} to
+      \code{/dev/sda2}, by blocks of 1 MB, but starting at offset 4 MB
+      in \code{/dev/sda2}
+    \end{itemize}
+  \end{itemize}
+\end{frame}
+
+\subsection{Available filesystems}
+
+\begin{frame}{Standard Linux filesystem format: ext2, ext3, ext4}
   \begin{itemize}
-  \item Can be left in a non-coherent state after a system crash or
-    sudden poweroff, which requires a full filesystem check after
-    reboot.
-  \item \code{ext2}: traditional Linux filesystem\\
-    (repair it with \code{fsck.ext2})
-  \item \code{vfat}: traditional Windows filesystem\\
-    (repair it with \code{fsck.vfat} on GNU/Linux or Scandisk on
-    Windows)
+  \item The standard filesystem used on Linux systems is the series of
+    \code{ext{2,3,4}} filesystems
+    \begin{itemize}
+    \item \code{ext2}
+    \item \code{ext3}, brought {\em journaling} compared to \code{ext2}
+    \item \code{ext4}, mainly brought performance improvements and
+      support for even larger filesystems
+    \end{itemize}
+  \item \code{ext4} is now the default filesystem used on most Linux
+    distributions
+  \item It supports all features Linux needs from a filesystem:
+    permissions, ownership, device files, symbolic links, etc.
   \end{itemize}
 \end{frame}
 
 \begin{frame}
   \frametitle{Journaled filesystems}
   \begin{columns}
-    \column{0.4\textwidth}
+    \column{0.6\textwidth}
     \begin{itemize}
-    \item Designed to stay in a correct state even after system crashes
-      or a sudden poweroff
-    \item All writes are first described in the journal before being
-      committed to files
+    \item Designed to stay in a coherent state even after system
+      crashes or a sudden poweroff
+    \item Writes are first described in the journal before being
+      committed to files (can be all writes, or only metadata writes
+      depending on the configuration)
+    \item Allows skipping a full disk check at boot time after an
+      unclean shutdown
     \end{itemize}
-    \column{0.6\textwidth}
+    \column{0.4\textwidth}
     \includegraphics[width=\textwidth]{slides/sysdev-block-filesystems/journal.pdf}
   \end{columns}
 \end{frame}
@@ -79,40 +135,111 @@ major minor #blocks name
 \begin{frame}
   \frametitle{Filesystem recovery after crashes}
   \begin{columns}
-    \column{0.6\textwidth}
-    \includegraphics[width=\textwidth]{slides/sysdev-block-filesystems/journal-recovery.pdf}
     \column{0.4\textwidth}
+    \includegraphics[width=\textwidth]{slides/sysdev-block-filesystems/journal-recovery.pdf}
+    \column{0.6\textwidth}
     \begin{itemize}
-    \item Thanks to the journal, the filesystem is never left in a
-      corrupted state
-    \item Recently saved data could still be lost
+    \item Thanks to the journal, the recovery at boot time is quick,
+      since the operations in progress at the moment of the unclean
+      shutdown are clearly identified
+    \item Does not mean that the latest writes made it to the storage:
+      this depends on syncing the changes to the filesystem.
     \end{itemize}
   \end{columns}
 \end{frame}
 
 \begin{frame}
-  \frametitle{Journaled block filesystems}
-  Journaled filesystems
+  \frametitle{Other Linux/Unix filesystems}
+  \begin{itemize}
+  \item \code{btrfs}, intended to become the next standard filesystem
+    for Linux. Integrates numerous features: data checksumming,
+    integrated volume management, snapshots, etc.
+  \item \code{XFS}, high-performance filesystem inherited from SGI
+    IRIX, still actively developed.
+  \item \code{JFS}, inherited from IBM AIX. No longer actively
+    developed, provided mainly for compatibility.
+  \item \code{reiserFS}, used to be a popular filesystem, but its
+    latest version \code{Reiser4} was never merged upstream.
+  \end{itemize}
+  All those filesystems provide the necessary functionalities for
+  Linux systems: symbolic links, permissions, ownership, device files,
+  etc.
+\end{frame}
+
+\begin{frame}
+  \frametitle{F2FS: filesystem for flash-based storage}
+  \url{http://en.wikipedia.org/wiki/F2FS}
+  \begin{itemize}
+  \item Filesystem that takes into account the characteristics of
+    flash-based storage: eMMC, SD cards, SSD, etc.
+  \item Developed and contributed by Samsung
+  \item Available in the mainline Linux kernel
+  \item For optimal results, needs a number of details about the
+    storage's internal behavior, which may not be easy to get
+  \item Benchmarks: best performer on flash devices most of the time: \\
+        See \url{http://lwn.net/Articles/520003/}
+  \item Technical details: \url{http://lwn.net/Articles/518988/}
+  \item Not as widely used as \code{ext3,4}, even on flash-based
+    storage.
+  \end{itemize}
+\end{frame}
+
+\begin{frame}
+  \frametitle{Squashfs: read-only filesystem}
   \begin{itemize}
-  \item \code{ext3}: \code{ext2} with journal extension\\
-    \code{ext4}: newest version in the family with many improvements.
-  \item \code{Btrfs} (``Butter FS'')\\
-    The next generation. Great performance. Now used in major
-    GNU/Linux distros.
-  \item The Linux kernel supports many other filesystems:
-    \code{reiserFS}, \code{JFS}, \code{XFS}, etc.  Each of them have
-    their own characteristics, but are more oriented towards server or
-    scientific workloads.
-  \item It's easy to switch filesystems. The best is to try each
-    and find out which yields the best performance on your own system.
+  \item Read-only, compressed filesystem for block devices. Fine for
+    parts of a filesystem which can be read-only (kernel, binaries...)
+  \item Great compression rate, which generally brings improved read
+    performance
+  \item Used in most live CDs and live USB distributions
+  \item Supports several compression algorithms (LZO, XZ, etc.)
+  \item Benchmarks: roughly 3 times smaller than ext3, and 2-4 times
+    faster (\url{http://elinux.org/Squash_Fs_Comparisons})
+  \item Details: \url{http://squashfs.sourceforge.net/}
   \end{itemize}
-  We recommend \code{ext2} for very small partitions ($<$ 5 MB),
-  because other filesystems need too much space for metadata
-  (\code{ext3} and \code{ext4} need about 1 MB for a 4 MB partition).
 \end{frame}
 
 \begin{frame}
-  \frametitle{Creating ext2/ext3/ext4 volumes}
+  \frametitle{Compatibility filesystems}
+  Linux also supports several other filesystem formats, mainly to be
+  interoperable with other operating systems:
+  \begin{itemize}
+  \item \code{vfat} for compatibility with the FAT filesystem used in
+    the Windows world and on numerous removable devices
+    \begin{itemize}
+    \item This filesystem does {\em not} support features like
+      permissions, ownership, symbolic links, etc. Cannot be used for
+      a Linux root filesystem.
+    \end{itemize}
+  \item \code{ntfs} for compatibility with the NTFS filesystem used on
+    Windows
+  \item \code{hfs} for compatibility with the HFS filesystem used on
+    Mac OS
+  \item \code{iso9660}, the filesystem format used on CD-ROMs,
+    obviously a read-only filesystem
+  \end{itemize}
+\end{frame}
+
+\begin{frame}
+  \frametitle{tmpfs: filesystem in RAM}
+  \begin{itemize}
+  \item Not a block filesystem of course!
+  \item Perfect to store temporary data in RAM: system log files,
+    connection data, temporary files...
+  \item More space-efficient than ramdisks: files are directly in the
+    file cache, which grows and shrinks to accommodate stored files
+  \item How to use: choose a name to distinguish the various tmpfs
+    instances you could have. Examples:\\
+    \code{mount -t tmpfs varrun /var/run}\\
+    \code{mount -t tmpfs udev /dev}
+  \item  See \kerneldoc{filesystems/tmpfs.txt} in kernel sources.
+  \end{itemize}
+\end{frame}
+
+\subsection{Using block filesystems}
+
+\begin{frame}
+  \frametitle{Creating ext2/ext3/ext4 filesystems}
   \begin{itemize}
   \item To create an empty ext2/ext3/ext4 filesystem on a block device or
     inside an already-existing image file
@@ -153,79 +280,22 @@ major minor #blocks name
 \end{frame}
 
 \begin{frame}
-  \frametitle{F2FS}
-  \url{http://en.wikipedia.org/wiki/F2FS}
-  \begin{itemize}
-  \item Filesystem optimized for block devices based on NAND flash
-  \item Available in the mainline Linux kernel
-  \item Benchmarks: best performer on flash devices most of the time: \\
-        See \url{http://lwn.net/Articles/520003/}
-  \item Technical details: \url{http://lwn.net/Articles/518988/}
-  \end{itemize}
-\end{frame}
-
-\begin{frame}
-  \frametitle{Squashfs}
-  Squashfs: \url{http://squashfs.sourceforge.net}
-  \begin{itemize}
-  \item Read-only, compressed filesystem for block devices. Fine for
-    parts of a filesystem which can be read-only (kernel, binaries...)
-  \item Great compression rate and read access performance
-  \item Used in most live CDs and live USB distributions
-  \item Supports LZO compression for better performance on embedded
-    systems with slow CPUs (at the expense of a slightly degraded
-    compression rate)
-  \item Now supports the XZ algorithm, for a much better compression
-        rate, at the expense of higher CPU usage and time.
-  \end{itemize}
-  Benchmarks: roughly 3 times smaller than ext3, and 2-4 times faster
-  (\url{http://elinux.org/Squash_Fs_Comparisons})
-\end{frame}
-
-\begin{frame}
-  \frametitle{Squashfs - How to use}
+  \frametitle{Creating squashfs filesystems}
   \begin{itemize}
   \item Need to install the \code{squashfs-tools} package
-  \item Creation of the image
+  \item Can only create an image: creating an empty {\em squashfs}
+    filesystem would be useless, since it's read-only.
+  \item To create a {\em squashfs} image:
     \begin{itemize}
-    \item On your workstation, create your filesystem image:\\
-      \code{mksquashfs rootfs/ rootfs.sqfs}
-    \item Caution: if the image already exists remove it first,\\
-      or use the \code{-noappend} option.
+    \item \code{mksquashfs -noappend rootfs/ rootfs.sqfs}
+    \item \code{-noappend}: re-create the image from scratch rather
+      than appending to it
     \end{itemize}
-  \item Installation of the image
+  \item Mounting a squashfs filesystem:
     \begin{itemize}
-    \item Let's assume your partition on the target is in
-      \code{/dev/sdc1}
-    \item Copy the filesystem image on the device\\
-      \code{dd if=rootfs.sqfs of=/dev/sdc1}\\
-      Be careful when using \code{dd} to not overwrite the incorrect
-      partition!
+    \item \code{mount -t squashfs /dev/<device> /mnt}
     \end{itemize}
-  \item Mount your filesystem:\\
-    \code{mount -t squashfs /dev/sdc1 /mnt/root}
-  \end{itemize}
-\end{frame}
-
-\begin{frame}
-  \frametitle{tmpfs}
-
-  Not a block filesystem of course!
-
-  Perfect to store temporary data in RAM: system log files, connection
-  data, temporary files...
-
-  \begin{itemize}
-  \item \code{tmpfs} configuration: \code{File systems -> Pseudo filesystems}\\
-    Lives in the Linux file cache. Doesn't waste RAM: unlike ramdisks, no need
-    to copy files to the file cache, grows and shrinks to accommodate stored files.
-    Saves RAM: can swap out pages to disk when needed.
-  \item How to use: choose a name to distinguish the various tmpfs
-    instances you could have. Examples:\\
-    \code{mount -t tmpfs varrun /var/run}\\
-    \code{mount -t tmpfs udev /dev}
   \end{itemize}
-  See \kerneldoc{filesystems/tmpfs.txt} in kernel sources.
 \end{frame}
 
 \begin{frame}
@@ -257,12 +327,8 @@ major minor #blocks name
   \item No details about the layer (Flash Translation Layer) they
     use. Details are kept as trade secrets, and may hide poor
     implementations.
-  \item Can use {\em flashbench}
-        (\url{https://github.com/bradfa/flashbench}) to find out
-        the erase block size and optimize filesystem formating.
-  \item Not knowing about the wear leveling algorithm,
-        it is highly recommended to limit the number of writes
-        to these devices.
+  \item Not knowing about the wear leveling algorithm, it is highly
+    recommended to limit the number of writes to these devices.
   \end{itemize}
 \end{frame}
 



