[FE training-materials-updates] Major rewrite of the block filesystem part
Thomas Petazzoni
thomas.petazzoni at free-electrons.com
Wed Sep 7 16:18:37 CEST 2016
Repository : git://git.free-electrons.com/training-materials.git
On branch : master
Link : http://git.free-electrons.com/training-materials/commit/?id=443ceaab47322130db8e0300397a504610b731ec
>---------------------------------------------------------------
commit 443ceaab47322130db8e0300397a504610b731ec
Author: Thomas Petazzoni <thomas.petazzoni at free-electrons.com>
Date: Wed Sep 7 16:18:37 2016 +0200
Major rewrite of the block filesystem part
- One slide about partitioning, to talk about the partition table and
the tools to create/manipulate it
- One slide about "dd" and how to use it to transfer data to/from a
block device
- Introduce filesystems more clearly:
- One slide about ext2/ext3/ext4
- Then the explanation about journaling
- One slide about other Unix filesystems (btrfs, XFS, JFS, ReiserFS)
- One slide about compatibility filesystems (VFAT, NTFS, HFS, ISO9660)
- Slides about squashfs, f2fs, tmpfs
- Slides on how to use block filesystems
- ext2/3/4
- loop for mounting images
- squashfs
Signed-off-by: Thomas Petazzoni <thomas.petazzoni at free-electrons.com>
>---------------------------------------------------------------
443ceaab47322130db8e0300397a504610b731ec
.../sysdev-block-filesystems.tex | 294 +++++++++++++--------
1 file changed, 180 insertions(+), 114 deletions(-)
diff --git a/slides/sysdev-block-filesystems/sysdev-block-filesystems.tex b/slides/sysdev-block-filesystems/sysdev-block-filesystems.tex
index c54f424..c3d7f6c 100644
--- a/slides/sysdev-block-filesystems/sysdev-block-filesystems.tex
+++ b/slides/sysdev-block-filesystems/sysdev-block-filesystems.tex
@@ -1,5 +1,7 @@
\section{Block filesystems}
+\subsection{Block devices}
+
\begin{frame}
\frametitle{Block vs. flash}
\begin{itemize}
@@ -13,9 +15,10 @@
basis, in random order, without erasing.
\begin{itemize}
\item Hard disks, floppy disks, RAM disks
- \item USB keys, Compact Flash, SD card: these are based on
- flash storage, but have an integrated controller that emulates a block
- device, managing and erasing flash sectors in a transparent way.
+ \item USB keys, SSD, Compact Flash, SD card, eMMC: these are based
+ on flash storage, but have an integrated controller that
+ emulates a block device, managing the flash in a transparent
+ way.
\end{itemize}
\item {\bf Raw flash devices} are driven by a controller on the
SoC. They can be read, but writing requires erasing, and often occurs
@@ -46,32 +49,85 @@ major minor #blocks name
\end{itemize}
\end{frame}
-\begin{frame}
- \frametitle{Traditional block filesystems}
- Traditional filesystems
+\begin{frame}{Partitioning}
+ \begin{itemize}
+ \item Block devices can be partitioned to store different parts of a
+ system
+ \item The partition table is stored inside the device itself, and is
+ read and analyzed automatically by the Linux kernel
+ \begin{itemize}
+ \item \code{mmcblk0} is the entire device
+ \item \code{mmcblk0p2} is the second partition of \code{mmcblk0}
+ \end{itemize}
+ \item Two partition table formats:
+ \begin{itemize}
+ \item {\em MBR}, the legacy format
+ \item {\em GPT}, the new format, not yet used everywhere, but
+ becoming more and more common
+ \end{itemize}
+ \item Numerous tools to create and modify the partitions on a block
+ device: \code{fdisk}, \code{cfdisk}, \code{sfdisk}, \code{parted},
+ etc.
+ \end{itemize}
+\end{frame}
+
+\begin{frame}{Transferring data to a block device}
+ \begin{itemize}
+ \item It is often necessary to transfer data to or from a block
+ device in a {\em raw} way
+ \begin{itemize}
+ \item Especially to write a {\em filesystem image} to a block
+ device
+ \end{itemize}
+ \item This directly writes to the block device itself, bypassing any
+ filesystem layer.
+ \item The block devices in \code{/dev/} allow such {\em raw} access
+ \item \code{dd} is the tool of choice for such transfers:
+ \begin{itemize}
+ \item \code{dd if=/dev/mmcblk0p1 of=testfile bs=1M count=16}\\
+ Transfers 16 blocks of 1 MB from \code{/dev/mmcblk0p1} to
+ \code{testfile}
+ \item \code{dd if=testfile of=/dev/sda2 bs=1M seek=4}\\
+ Transfers the complete contents of \code{testfile} to
+ \code{/dev/sda2}, by blocks of 1 MB, but starting at offset 4 MB
+ in \code{/dev/sda2}
+ \end{itemize}
+ \end{itemize}
+\end{frame}
+
+\subsection{Available filesystems}
+
+\begin{frame}{Standard Linux filesystem format: ext2, ext3, ext4}
\begin{itemize}
- \item Can be left in a non-coherent state after a system crash or
- sudden poweroff, which requires a full filesystem check after
- reboot.
- \item \code{ext2}: traditional Linux filesystem\\
- (repair it with \code{fsck.ext2})
- \item \code{vfat}: traditional Windows filesystem\\
- (repair it with \code{fsck.vfat} on GNU/Linux or Scandisk on
- Windows)
+ \item The standard filesystem used on Linux systems is the series of
+ \code{ext{2,3,4}} filesystems
+ \begin{itemize}
+ \item \code{ext2}
+ \item \code{ext3}, brought {\em journaling} compared to \code{ext2}
+ \item \code{ext4}, mainly brought performance improvements and
+ support for even larger filesystems
+ \end{itemize}
+ \item \code{ext4} is now the default filesystem used on most Linux
+ distributions
+ \item It supports all features Linux needs from a filesystem:
+ permissions, ownership, device files, symbolic links, etc.
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{Journaled filesystems}
\begin{columns}
- \column{0.4\textwidth}
+ \column{0.6\textwidth}
\begin{itemize}
- \item Designed to stay in a correct state even after system crashes
- or a sudden poweroff
- \item All writes are first described in the journal before being
- committed to files
+ \item Designed to stay in a coherent state even after system
+ crashes or a sudden poweroff
+ \item Writes are first described in the journal before being
+ committed to files (can be all writes, or only metadata writes
+ depending on the configuration)
+ \item Allows skipping a full disk check at boot time after an
+ unclean shutdown
\end{itemize}
- \column{0.6\textwidth}
+ \column{0.4\textwidth}
\includegraphics[width=\textwidth]{slides/sysdev-block-filesystems/journal.pdf}
\end{columns}
\end{frame}
@@ -79,40 +135,111 @@ major minor #blocks name
\begin{frame}
\frametitle{Filesystem recovery after crashes}
\begin{columns}
- \column{0.6\textwidth}
- \includegraphics[width=\textwidth]{slides/sysdev-block-filesystems/journal-recovery.pdf}
\column{0.4\textwidth}
+ \includegraphics[width=\textwidth]{slides/sysdev-block-filesystems/journal-recovery.pdf}
+ \column{0.6\textwidth}
\begin{itemize}
- \item Thanks to the journal, the filesystem is never left in a
- corrupted state
- \item Recently saved data could still be lost
+ \item Thanks to the journal, the recovery at boot time is quick,
+ since the operations in progress at the moment of the unclean
+ shutdown are clearly identified
+ \item Does not mean that the latest writes made it to the storage:
+ this depends on syncing the changes to the filesystem.
\end{itemize}
\end{columns}
\end{frame}
\begin{frame}
- \frametitle{Journaled block filesystems}
- Journaled filesystems
+ \frametitle{Other Linux/Unix filesystems}
+ \begin{itemize}
+ \item \code{btrfs}, intended to become the next standard filesystem
+ for Linux. Integrates numerous features: data checksumming,
+ integrated volume management, snapshots, etc.
+ \item \code{XFS}, high-performance filesystem inherited from SGI
+ IRIX, still actively developed.
+ \item \code{JFS}, inherited from IBM AIX. No longer actively
+ developed, provided mainly for compatibility.
+ \item \code{reiserFS}, used to be a popular filesystem, but its
+ latest version \code{Reiser4} was never merged upstream.
+ \end{itemize}
+ All those filesystems provide the necessary functionalities for
+ Linux systems: symbolic links, permissions, ownership, device files,
+ etc.
+\end{frame}
+
+\begin{frame}
+ \frametitle{F2FS: filesystem for flash-based storage}
+ \url{http://en.wikipedia.org/wiki/F2FS}
+ \begin{itemize}
+ \item Filesystem that takes into account the characteristics of
+ flash-based storage: eMMC, SD cards, SSD, etc.
+ \item Developed and contributed by Samsung
+ \item Available in the mainline Linux kernel
+ \item For optimal results, needs a number of details about the
+ storage's internal behavior, which may not be easy to get
+ \item Benchmarks: best performer on flash devices most of the time: \\
+ See \url{http://lwn.net/Articles/520003/}
+ \item Technical details: \url{http://lwn.net/Articles/518988/}
+ \item Not as widely used as \code{ext3,4}, even on flash-based
+ storage.
+ \end{itemize}
+\end{frame}
+
+\begin{frame}
+ \frametitle{Squashfs: read-only filesystem}
\begin{itemize}
- \item \code{ext3}: \code{ext2} with journal extension\\
- \code{ext4}: newest version in the family with many improvements.
- \item \code{Btrfs} (``Butter FS'')\\
- The next generation. Great performance. Now used in major
- GNU/Linux distros.
- \item The Linux kernel supports many other filesystems:
- \code{reiserFS}, \code{JFS}, \code{XFS}, etc. Each of them have
- their own characteristics, but are more oriented towards server or
- scientific workloads.
- \item It's easy to switch filesystems. The best is to try each
- and find out which yields the best performance on your own system.
+ \item Read-only, compressed filesystem for block devices. Fine for
+ parts of a filesystem which can be read-only (kernel, binaries...)
+ \item Great compression rate, which generally brings improved read
+ performance
+ \item Used in most live CDs and live USB distributions
+ \item Supports several compression algorithms (LZO, XZ, etc.)
+ \item Benchmarks: roughly 3 times smaller than ext3, and 2-4 times
+ faster (\url{http://elinux.org/Squash_Fs_Comparisons})
+ \item Details: \url{http://squashfs.sourceforge.net/}
\end{itemize}
- We recommend \code{ext2} for very small partitions ($<$ 5 MB),
- because other filesystems need too much space for metadata
- (\code{ext3} and \code{ext4} need about 1 MB for a 4 MB partition).
\end{frame}
\begin{frame}
- \frametitle{Creating ext2/ext3/ext4 volumes}
+ \frametitle{Compatibility filesystems}
+ Linux also supports several other filesystem formats, mainly to be
+ interoperable with other operating systems:
+ \begin{itemize}
+ \item \code{vfat} for compatibility with the FAT filesystem used in
+ the Windows world and on numerous removable devices
+ \begin{itemize}
+ \item This filesystem does {\em not} support features like
+ permissions, ownership, symbolic links, etc. Cannot be used for
+ a Linux root filesystem.
+ \end{itemize}
+ \item \code{ntfs} for compatibility with the NTFS filesystem used on
+ Windows
+ \item \code{hfs} for compatibility with the HFS filesystem used on
+ Mac OS
+ \item \code{iso9660}, the filesystem format used on CD-ROMs,
+ obviously a read-only filesystem
+ \end{itemize}
+\end{frame}
+
+\begin{frame}
+ \frametitle{tmpfs: filesystem in RAM}
+ \begin{itemize}
+ \item Not a block filesystem of course!
+ \item Perfect to store temporary data in RAM: system log files,
+ connection data, temporary files...
+ \item More space-efficient than ramdisks: files live directly in the
+ file cache, which grows and shrinks to accommodate stored files
+ \item How to use: choose a name to distinguish the various tmpfs
+ instances you could have. Examples:\\
+ \code{mount -t tmpfs varrun /var/run}\\
+ \code{mount -t tmpfs udev /dev}
+ \item See \kerneldoc{filesystems/tmpfs.txt} in kernel sources.
+ \end{itemize}
+\end{frame}
+
+\subsection{Using block filesystems}
+
+\begin{frame}
+ \frametitle{Creating ext2/ext3/ext4 filesystems}
\begin{itemize}
\item To create an empty ext2/ext3/ext4 filesystem on a block device or
inside an already-existing image file
@@ -153,79 +280,22 @@ major minor #blocks name
\end{frame}
\begin{frame}
- \frametitle{F2FS}
- \url{http://en.wikipedia.org/wiki/F2FS}
- \begin{itemize}
- \item Filesystem optimized for block devices based on NAND flash
- \item Available in the mainline Linux kernel
- \item Benchmarks: best performer on flash devices most of the time: \\
- See \url{http://lwn.net/Articles/520003/}
- \item Technical details: \url{http://lwn.net/Articles/518988/}
- \end{itemize}
-\end{frame}
-
-\begin{frame}
- \frametitle{Squashfs}
- Squashfs: \url{http://squashfs.sourceforge.net}
- \begin{itemize}
- \item Read-only, compressed filesystem for block devices. Fine for
- parts of a filesystem which can be read-only (kernel, binaries...)
- \item Great compression rate and read access performance
- \item Used in most live CDs and live USB distributions
- \item Supports LZO compression for better performance on embedded
- systems with slow CPUs (at the expense of a slightly degraded
- compression rate)
- \item Now supports the XZ algorithm, for a much better compression
- rate, at the expense of higher CPU usage and time.
- \end{itemize}
- Benchmarks: roughly 3 times smaller than ext3, and 2-4 times faster
- (\url{http://elinux.org/Squash_Fs_Comparisons})
-\end{frame}
-
-\begin{frame}
- \frametitle{Squashfs - How to use}
+ \frametitle{Creating squashfs filesystems}
\begin{itemize}
\item Need to install the \code{squashfs-tools} package
- \item Creation of the image
+ \item Can only create an image: creating an empty {\em squashfs}
+ filesystem would be useless, since it's read-only.
+ \item To create a {\em squashfs} image:
\begin{itemize}
- \item On your workstation, create your filesystem image:\\
- \code{mksquashfs rootfs/ rootfs.sqfs}
- \item Caution: if the image already exists remove it first,\\
- or use the \code{-noappend} option.
+ \item \code{mksquashfs -noappend rootfs/ rootfs.sqfs}
+ \item \code{-noappend}: re-create the image from scratch rather
+ than appending to it
\end{itemize}
- \item Installation of the image
+ \item Mounting a squashfs filesystem:
\begin{itemize}
- \item Let's assume your partition on the target is in
- \code{/dev/sdc1}
- \item Copy the filesystem image on the device\\
- \code{dd if=rootfs.sqfs of=/dev/sdc1}\\
- Be careful when using \code{dd} to not overwrite the incorrect
- partition!
+ \item \code{mount -t squashfs /dev/<device> /mnt}
\end{itemize}
- \item Mount your filesystem:\\
- \code{mount -t squashfs /dev/sdc1 /mnt/root}
- \end{itemize}
-\end{frame}
-
-\begin{frame}
- \frametitle{tmpfs}
-
- Not a block filesystem of course!
-
- Perfect to store temporary data in RAM: system log files, connection
- data, temporary files...
-
- \begin{itemize}
- \item \code{tmpfs} configuration: \code{File systems -> Pseudo filesystems}\\
- Lives in the Linux file cache. Doesn't waste RAM: unlike ramdisks, no need
- to copy files to the file cache, grows and shrinks to accommodate stored files.
- Saves RAM: can swap out pages to disk when needed.
- \item How to use: choose a name to distinguish the various tmpfs
- instances you could have. Examples:\\
- \code{mount -t tmpfs varrun /var/run}\\
- \code{mount -t tmpfs udev /dev}
\end{itemize}
- See \kerneldoc{filesystems/tmpfs.txt} in kernel sources.
\end{frame}
\begin{frame}
@@ -257,12 +327,8 @@ major minor #blocks name
\item No details about the layer (Flash Translation Layer) they
use. Details are kept as trade secrets, and may hide poor
implementations.
- \item Can use {\em flashbench}
- (\url{https://github.com/bradfa/flashbench}) to find out
- the erase block size and optimize filesystem formating.
- \item Not knowing about the wear leveling algorithm,
- it is highly recommended to limit the number of writes
- to these devices.
+ \item Not knowing about the wear leveling algorithm, it is highly
+ recommended to limit the number of writes to these devices.
\end{itemize}
\end{frame}
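
The seek= behaviour described on the new "dd" slide can be tried without
touching a real device. The following is only a sketch, assuming GNU
coreutils or BusyBox dd (for the bs=1M syntax). The names disk.img and
readback are made up for the illustration, and a regular file stands in for
/dev/sda2. conv=notrunc is not on the slide: it is added here only because
dd would otherwise truncate a regular output file after the written data,
which does not happen on a block device.

```shell
#!/bin/sh
# Sketch: replay the slide's "dd ... seek=4" example on ordinary files.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical stand-in for the block device: a 16 MiB file of zeros.
dd if=/dev/zero of=disk.img bs=1M count=16 2>/dev/null

# A 1 MiB payload playing the role of "testfile" on the slide.
dd if=/dev/urandom of=testfile bs=1M count=1 2>/dev/null

# Write the payload starting 4 blocks (4 MiB) into the output.
# conv=notrunc keeps the rest of the image file intact (assumption: only
# needed because the target is a regular file, not a device node).
dd if=testfile of=disk.img bs=1M seek=4 conv=notrunc 2>/dev/null

# Read the same region back: skip= is the input-side counterpart of seek=.
dd if=disk.img of=readback bs=1M skip=4 count=1 2>/dev/null

# cmp exits non-zero (and set -e aborts) if the two files differ.
cmp testfile readback
echo "payload found at offset 4 MiB"
```

seek= counts blocks of the output and skip= counts blocks of the input,
both in units of bs, so changing bs also changes the byte offset.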