combox-paper

notes and other things concerning combox
git clone git://git.ricketyspace.net/combox-paper.git

commit 29036930617490e50a2b00922a7a4119c7225b7b
parent 5f42ead73d8df02d7c6a1eb9d9627a5d76ff2175
Author: Siddharth Ravikumar <sravik@bgsu.edu>
Date:   Tue, 22 Mar 2016 09:47:11 -0400

space before \cite.

Diffstat:
report/chapters/1-intr.tex | 82++++++++++++++++++++++++++++++++++++++++----------------------------------------
report/chapters/2-lit-r.tex | 40+++++++++++++++++++---------------------
report/chapters/3-arch-d.tex | 75+++++++++++++++++++++++++++++++++++++++++++--------------------------------
report/chapters/4-testing.tex | 95++++++++++++++++++++++++++++++++++++++++---------------------------------------
4 files changed, 151 insertions(+), 141 deletions(-)

diff --git a/report/chapters/1-intr.tex b/report/chapters/1-intr.tex @@ -6,9 +6,9 @@ Internet companies have made it trivial for computer users to store data/information on their servers and at the same time there is a lot of evidence of governments and other powerful organizations being able -to access information/data stored on the Internet companies' -computers\cite{website:wikileaks-spyfiles}. Also, most companies add a -standard clause in their privacy policy that allows them to disclose +to access information/data stored on the Internet companies' computers +\cite{website:wikileaks-spyfiles}. Also, most companies add a standard +clause in their privacy policy that allows them to disclose information about users or information stored/created by users to ``third parties'': @@ -18,7 +18,7 @@ information about users or information stored/created by users to to (a) comply with the law; (b) protect any person from death or serious bodily injury; (c) prevent fraud or abuse of Dropbox or our users; or (d) protect Dropbox's property rights. -- Dropbox Privacy - Policy\cite{website:dropbox-privacy} + Policy \cite{website:dropbox-privacy} \end{quote} In this type of world, it would be good to have a program that would @@ -27,22 +27,22 @@ provided by Internet companies. combox aims to be one such program which not only encrypts but stores only a part of the encrypted data/information on the storage provided by an Internet company, thus making it non-trivial for ``third parties'' to access the user's -data/information in its entirety. Section \ref{1-sec-cb} gives a conceptual -introduction to combox; Section \ref{1-sec-cb-diff} enumerates how -combox is different from Vollmar's Combo-Box; lastly, section -\ref{1-sec-using-cb} contains information on how one can start using -combox. +data/information in its entirety. 
Section \ref{1-sec-cb} gives a +conceptual introduction to combox; Section \ref{1-sec-cb-diff} +enumerates how combox is different from Vollmar's Combo-Box; lastly, +section \ref{1-sec-using-cb} contains information on how one can start +using combox. \section{What is combox?}\label{1-sec-cb} -combox allows the user to store all of their files in the ``combox +combox allows the user to store all of their files in the ``combox directory'' and combox picks each file stored in the combox directory, -splits them into $N$ shards, encrypts each of the $N$ shards and spreads -the shards to $N$ node directories. A ``node directory'' is the -directory of the file storage provider (Dropbox directory is a node -directory). Figure \ref{fig:1-combox-overview-0}, illustrates how a file -called \verb+strunk-white.pdf+ is split, encrypted and spread across -$N$ node directories; shards \verb+strunk-white.pdf.shard0+ to +splits them into $N$ shards, encrypts each of the $N$ shards and +spreads the shards to $N$ node directories. A ``node directory'' is +the directory of the file storage provider (Dropbox directory is a +node directory). Fig. \ref{fig:1-combox-overview-0}, illustrates how a +file called \verb+strunk-white.pdf+ is split, encrypted and spread +across $N$ node directories; shards \verb+strunk-white.pdf.shard0+ to \verb+strunk-white.pdf.shardN+ are encrypted. \begin{figure}[h] @@ -74,22 +74,22 @@ $N$ node directories; shards \verb+strunk-white.pdf.shard0+ to \end{verbatim} \caption{combox overview - Splitting a file in the combox directory and spreading it across N node directories.} -\label{fig:1-combox-overview-0} + \label{fig:1-combox-overview-0} \end{figure} combox does not sync encrypted shards stored in the node directories -to the respective file storage providers' data store. Instead, it depends on -the respective file storage provider's client program to sync the -shards. +to the respective file storage providers' data store. 
Instead, it +depends on the respective file storage provider's client program to +sync the shards. combox can be used on all of the user's computers. For instance, the user can install combox on their second computer and combox will reconstruct the file from the encrypted shards stored in the node -directories into the combox directory on their second computer; Fig. -\ref{fig:1-combox-overview-1} illustrates this. Here too, combox +directories into the combox directory on their second computer; +Fig. \ref{fig:1-combox-overview-1} illustrates this. Here too, combox depends on the client program of the respective file storage provider -to sync shards to/from the file storage provider's data store and to/from -the respective node directory on the user's computer. +to sync shards to/from the file storage provider's data store and +to/from the respective node directory on the user's computer. \begin{figure}[h] \begin{verbatim} @@ -122,7 +122,7 @@ the respective node directory on the user's computer. \caption{combox overview - Reconstructing a file into the combox directory from the encrypted shards located in the node directories.} -\label{fig:1-combox-overview-1} + \label{fig:1-combox-overview-1} \end{figure} As of combox version \verb+0.2.3+, combox is compatible on GNU/Linux @@ -145,8 +145,8 @@ enumerated below: \item[File splitting] Combo-Box splits a file into shards based on the space available on each node directory \cite{vollmar-combo-box}, while combox is not yet cognizant about space left on each node - directory and splits the file into $N$ equal shards, where $N$ is equal - to the number of node directories. + directory and splits the file into $N$ equal shards, where $N$ is + equal to the number of node directories. \item[User Interface] Combo-Box is a graphical application while combox is mostly a command-line program. combox's configuration wizard has a graphical interface. 
The configuration wizard has a @@ -154,15 +154,15 @@ enumerated below: \item[Database] Combo-Box uses a traditional SQL database with two tables to keep track of files' shards, files' hash, files' last ``sync time'' and for ``security and stability'' uses stored - procedures that retrieve/store information in the - database \cite{vollmar-combo-box}. + procedures that retrieve/store information in the database + \cite{vollmar-combo-box}. combox on the other hand uses a key-value data store to track the - files stored in the combox directory using the pickleDB - library \cite{pylib:pickledb}. The key-value data store is a JSON - file and all access to this data store is done through an instance - of \verb+combox.silo.ComboxSilo+ - class\footnote{https://git.ricketyspace.net/combox/tree/combox/silo.py?id=fb7fdd218\#n29} + files stored in the combox directory using the pickleDB library + \cite{pylib:pickledb}. The key-value data store is a JSON file and + all access to this data store is done through an instance of + \verb+combox.silo.ComboxSilo+ class + \footnote{https://git.ricketyspace.net/combox/tree/combox/silo.py?id=fb7fdd218\#n29} which ensures that only one thread can read from or write to the data store at any time through a lock (\verb+threading.Lock+). In the data store, combox keeps track of the hashes of all the files @@ -170,12 +170,12 @@ enumerated below: dictionaries that track number of shards which have been create/moved/modified/deleted on another computer. -\item[Installation] Combo-Box uses the proprietary - InstallShield \cite{nonfree-installshield} to install the program, - setup shortcuts and registry settings\cite{vollmar-combo-box}. +\item[Installation] Combo-Box uses the proprietary InstallShield + \cite{nonfree-installshield} to install the program, setup shortcuts + and registry settings \cite{vollmar-combo-box}. 
combox is a python package, it can either be installed through - python's package manager (\verb+pip+\cite{py:pip}) with + python's package manager (\verb+pip+ \cite{py:pip}) with \verb+pip install combox+ or it can be installed from the source with the standard \verb+python setup.py install+. @@ -209,9 +209,9 @@ https://ricketyspace.net/combox/setup/. \subsection{Caveats} combox is extremely event-driven and depends on filesystem events to -do the correct action when a file is created/modified/moved/deleted, so -the user must make sure to start combox before starting the file +do the correct action when a file is created/modified/moved/deleted, +so the user must make sure to start combox before starting the file storage providers' client programs that sync encrypted shards to the respective node directories. On GNU/Linux distributions this can be automated through the distribution's start-up system (most GNU/Linux -distributions seem to use \verb+systemd+\cite{website:systemd}). +distributions seem to use \verb+systemd+ \cite{website:systemd}). diff --git a/report/chapters/2-lit-r.tex b/report/chapters/2-lit-r.tex @@ -15,7 +15,7 @@ publish/subscribe overlay paradigm and stores the content across multiple cloud storage providers such that only part of the content (in encrypted form) is stored on each file storage provider \cite{skycds}(Section \ref{2-skycds-sec}); and, lastly, -\verb+git-annex+, by Joey Hess\cite{person:joeyh}, that allows one to +\verb+git-annex+, by Joey Hess \cite{person:joeyh}, that allows one to version control and keep track of large files with a possibility of encrypting files that are stored in ``special remotes'' -- storage provided by Internet file storage providers (Section @@ -38,13 +38,13 @@ is stored in a SQLite database. Unlike combox, which depends the file storage provider' client to sync file fragments/shards to the file storage provider's data store, the Android application developed by Yeo et al. 
takes the responsibility to sync file fragments/shards to -each file storage provider and uses the OAuth -2.0\cite{protocol:oauth2} protocol for authorization. +each file storage provider and uses the OAuth 2.0 +\cite{protocol:oauth2} protocol for authorization. For encrypting file fragments, they use AES-256; the key for encrypting file fragments is derived from the user's password by using -Password-Based Key Derivation Function (PBKDF2)\cite{kaliski}. For -erasure coding they use the JigDFS library\cite{jigdfs}. The Android +Password-Based Key Derivation Function (PBKDF2) \cite{kaliski}. For +erasure coding they use the JigDFS library \cite{jigdfs}. The Android application is able do ``progressive streaming'' of media files; this means that large media files can be streamed in real-time from the from the file storage providers' data store; this is an attractive @@ -111,13 +111,12 @@ space and reliability. \section{git-annex}\label{2-gitannex-sec} \verb+git-annex+ allows one to version controlled large files that are -not usually feasible to version control under -\verb+git+\cite{program:git}. \verb+git-annex+ checks in the name and -other meta-data about the files in git and stores the actual content -under \verb+.git/annex+ directory. When a file is added to -\verb+git-annex+, a symlink of the file is created in place of the -file and the content of the file itself is stored under the -\verb+.git/annex+ directory. +not usually feasible to version control under \verb+git+ +\cite{program:git}. \verb+git-annex+ checks in the name and other +meta-data about the files in git and stores the actual content under +\verb+.git/annex+ directory. When a file is added to \verb+git-annex+, +a symlink of the file is created in place of the file and the content +of the file itself is stored under the \verb+.git/annex+ directory. 
For instance, say there is a file called \verb+deb-nicholson-80s.medium.webm+ that was downloaded from the @@ -169,17 +168,16 @@ to sync the repository to other \verb+git-annex+ repositories. It must be noted here that when the repository is synced, the file content itself is not transferred to the other \verb+git-annex+ repositories; only the file's name and its meta-data that is stored in a separate -git branch called \verb+git-annex+ are -transferred\cite{documentation:git-annex-hworks}. In order to create a -copy of a given file in another git annex repository, +git branch called \verb+git-annex+ are transferred +\cite{documentation:git-annex-hworks}. In order to create a copy of a +given file in another git annex repository, \verb+git annex get /path/to/filename.ext+ has to done. -\verb+git-annex+ has this feature called ``special -remotes''\cite{documentation:git-annex-sremotes}, that allows one to -push files checked into \verb+git-annex+ to storage provided by file -storage providers. At the time of writing this report, -\verb+git-annex+ supports pushing data to the following file storage -services: +\verb+git-annex+ has this feature called ``special remotes'' +\cite{documentation:git-annex-sremotes}, that allows one to push files +checked into \verb+git-annex+ to storage provided by file storage +providers. At the time of writing this report, \verb+git-annex+ +supports pushing data to the following file storage services: {\scriptsize \begin{itemize} diff --git a/report/chapters/3-arch-d.tex b/report/chapters/3-arch-d.tex @@ -5,7 +5,7 @@ examine things in greater detail, these simple models become inadequate and must be replaced by more refined models.}{\textit{Structure and Interpretation of Computer Programs, - Section 1.1.5}\cite{sicp}} + Section 1.1.5} \cite{sicp}} \section{Structure of combox} @@ -39,9 +39,9 @@ encrypted shards to their respective directories. 
Once the encrypted shards are synced to the node directories, combox will pick the encrypted shards -- \verb+humans.txt.shard0+, \verb+humans.txt.shard1+ -- decrypt them and reconstruct into \verb+humans.txt+ and place it in -the respective location under the combox directory; figure -\ref{fig:3-combox-structure} illustrates this. The process is similar -for file modification, deletion and rename/move. +the respective location under the combox directory; +Fig. \ref{fig:3-combox-structure} illustrates this. The process is +similar for file modification, deletion and rename/move. \subsection{combox configuration}\label{sec:3-combox-config} @@ -54,8 +54,8 @@ directories. The combox configuration is written to \verb+$HOME/.combox/config.yaml+. This YAML configuration file can be manually edited by the user. -The -\verb+config_cb+\footnote{https://git.ricketyspace.net/combox/tree/combox/config.py?id=fb7fdd21\#n90} +The \verb+config_cb+ +\footnote{https://git.ricketyspace.net/combox/tree/combox/config.py?id=fb7fdd21\#n90} function in the \verb+combox.config+ module is responsible for carrying out the combox configuration. Prior to version \verb+0.2.0+, the combox configuration was purely done through the Command Line @@ -66,15 +66,16 @@ switch. A demo of combox configuration using the graphical interface on GNU/Linux can be viewed -\url{https://ricketyspace.net/combox/combox-config-gui-glued-gnu.webm}{here}. +\url{https://ricketyspace.net/combox/combox-config-gui-glued-gnu.webm}. T he same demo of combox configuration using the graphical interface on OS X can be viewed -\url{https://ricketyspace.net/combox/combox-config-gui-glued-osx.webm}{here}. +\url{https://ricketyspace.net/combox/combox-config-gui-glued-osx.webm}. 
\subsection{combox directory monitor}\label{sec:3-combox-cdirm} combox directory monitor is an instance of -\verb+combox.events.ComboxDirMonitor+\footnote{https://git.ricketyspace.net/combox/tree/combox/events.py?id=fb7fdd21\#n42} +\verb+combox.events.ComboxDirMonitor+ +\footnote{https://git.ricketyspace.net/combox/tree/combox/events.py?id=fb7fdd21\#n42} monitoring the combox directory for changes. When changes are made to the combox directory, the combox directory monitor is responsible for correctly detecting the type of change and doing the right thing at @@ -105,7 +106,8 @@ and store the hash of file under its new name. \subsection{Node directory monitor}\label{sec:3-combox-nodirm} Node directory monitor is an instance of -\verb+combox.events.NodeDirMonitor+\footnote{https://git.ricketyspace.net/combox/tree/combox/events.py?id=fb7fdd21\#n352} +\verb+combox.events.NodeDirMonitor+ +\footnote{https://git.ricketyspace.net/combox/tree/combox/events.py?id=fb7fdd21\#n352} that monitors a node directory. When changes are made to the node directory, the node directory monitor is responsible for correctly detecting the type of change and doing the right thing at that @@ -188,13 +190,13 @@ remote file movement/deletion/creation/modification and triggering file reconstruction from the encrypted shards at the right time. The data store is a JSON file on the disk, stored by default at \\ -\verb+$HOME/.combox/silo.db+. The -\verb+combox.silo.ComboxSilo+\footnote{https://git.ricketyspace.net/combox/tree/combox/silo.py?id=v0.2.2\#n29} +\verb+$HOME/.combox/silo.db+. The \verb+combox.silo.ComboxSilo+ +\footnote{https://git.ricketyspace.net/combox/tree/combox/silo.py?id=v0.2.2\#n29} is the sole interface to read from and write to the data store. The data store is primarily accessed and modified by the combox directory monitor (\verb+ComboxDirMonitor+) and the node directory monitor (\verb+NodeDirMonitor+) through a shared \verb+threading. 
Lock+ that -ensures that only one entity\footnote{An entity can be the combox +ensures that only one entity \footnote{An entity can be the combox directory monitor or one of the node directory monitors} can access/modify the database at a time. @@ -248,7 +250,8 @@ This section gives an overview of each of the combox modules with extreme brevity. \begin{description} -\item[combox.cbox]\footnote{https://git.ricketyspace.net/combox/tree/combox/cbox.py?id=fb7fdd21} +\item[combox.cbox] + \footnote{https://git.ricketyspace.net/combox/tree/combox/cbox.py?id=fb7fdd21} This module contains \verb+run_cb+ function that starts/initiates combox; this function creates an instance \verb+threading.Lock+ for combox data store access and another instance of @@ -261,7 +264,8 @@ extreme brevity. function that parses commandline arguments, starts combox configuration if needed or loads the combox configuration file to start running combox. -\item[combox.config]\footnote{https://git.ricketyspace.net/combox/tree/combox/config.py?id=fb7fdd21} +\item[combox.config] + \footnote{https://git.ricketyspace.net/combox/tree/combox/config.py?id=fb7fdd21} Accommodates two import functions -- \verb+config_cb+ and \verb+get_nodedirs+. The \verb+config_cb+ is the combox configuration function that allows the user to configure combox; @@ -270,7 +274,8 @@ extreme brevity. combox. The \verb+get_nodedirs+ function returns, as a list, the paths of the node directories; this function use used in numerous places in other combox modules. -\item[combox.crypto]\footnote{https://git.ricketyspace.net/combox/tree/combox/crypto.py?id=fb7fdd21} +\item[combox.crypto] + \footnote{https://git.ricketyspace.net/combox/tree/combox/crypto.py?id=fb7fdd21} This has functions for encrypting and decrypting data; encrypting and decrypting shards (\verb+encrypt_shards+ and \verb+decrypt_shards+); a function for splitting a file into shards, @@ -284,7 +289,8 @@ extreme brevity. 
functions in this module are pretty much helper functions for \verb+split_and_encrypt+ and \verb+decrypt_and_glue+ functions and are not used by other modules. -\item[combox.events]\footnote{https://git.ricketyspace.net/combox/tree/combox/events.py?id=fb7fdd21} +\item[combox.events] + \footnote{https://git.ricketyspace.net/combox/tree/combox/events.py?id=fb7fdd21} This module took the most time to write and test and it is the most complex module in combox at the time of writing this report. It contains just two classes -- \verb+ComboxDirMonitor+ and @@ -298,28 +304,33 @@ extreme brevity. change happens in the node directory; subjectively, \verb+NodeDirMonitor+ is slightly more complex than the \verb+ComboxDirMonitor+. -\item[combox.file]\footnote{https://git.ricketyspace.net/combox/tree/combox/file.py?id=fb7fdd21} +\item[combox.file] + \footnote{https://git.ricketyspace.net/combox/tree/combox/file.py?id=fb7fdd21} This is the second largest module in combox. It contains utility functions for reading, writing, moving files/directories, hashing files, splitting a file into shards, gluing shards into a file, manipulating directories inside combox and node directories. -\item[combox.gui]\footnote{https://git.ricketyspace.net/combox/tree/combox/gui.py?id=fb7fdd21} +\item[combox.gui] + \footnote{https://git.ricketyspace.net/combox/tree/combox/gui.py?id=fb7fdd21} Contains the \verb+ComboxConfigDialog+ class; it is the graphical - interface for configuring combox. The class uses the Tkinter - library\cite{pylib:tkinter} for spawning graphical elements. Other - graphical libraries including PyQt\cite{pylib:qt} were considered, + interface for configuring combox. The class uses the Tkinter library + \cite{pylib:tkinter} for spawning graphical elements. 
Other + graphical libraries including PyQt \cite{pylib:qt} were considered, Tkinter was chosen over others due to compatibility with all Unix, Unix-like systems and Microsoft Windows and it is part of the standard python library from python version 3 on wards. -\item[combox.log]\footnote{https://git.ricketyspace.net/combox/tree/combox/log.py?id=fb7fdd21} +\item[combox.log] + \footnote{https://git.ricketyspace.net/combox/tree/combox/log.py?id=fb7fdd21} All the messages to \verb+stdout+ and \verb+stderr+ are sent through the \verb+log_i+ and \verb+log_e+ functions defined in this module. -\item[combox.silo]\footnote{https://git.ricketyspace.net/combox/tree/combox/silo.py?id=fb7fdd21} +\item[combox.silo] + \footnote{https://git.ricketyspace.net/combox/tree/combox/silo.py?id=fb7fdd21} Contains the \verb+ComboxSilo+ class which is the canonical interface for combox for managing information about the files in the combox directory. Internally, the \verb+ComboxSilo+ class uses the - pickleDB library\cite{pylib:pickledb}. -\item[combox.\_version]\footnote{https://git.ricketyspace.net/combox/tree/combox/\_version.py?id=fb7fdd21} + pickleDB library \cite{pylib:pickledb}. +\item[combox.\_version] + \footnote{https://git.ricketyspace.net/combox/tree/combox/\_version.py?id=fb7fdd21} This is \emph{private} module that contains variables that contain the value of the present version and release of combox. The \verb+get_version+ function in this module returns the full version @@ -337,7 +348,7 @@ the realm of the ``core functionality of combox''. The main reason behind this decision was to not indulge in trying to solve problems that others have already solved. -Accordingly, the \verb+watchdog+\cite{pylib:watchdog} library was +Accordingly, the \verb+watchdog+ \cite{pylib:watchdog} library was chosen for file monitoring. This library is compatible with Unix, Unix-like systems and Microsoft Windows. The \verb+pycrypto+ library \cite{pylib:pycrypto} was used for encrypting data. 
Combox uses AES @@ -405,17 +416,17 @@ Python has a package registry called CheeseShop \footnote{code name for Python Package Index, see https://wiki.python.org/moin/CheeseShop}. All packages registered at the CheeseShop can be installed using \verb+pip+ -- Python's -platform independent package management system\cite{py:pip} -- with: +platform independent package management system \cite{py:pip} -- with: \begin{verbatim} pip install packagename \end{verbatim} To make it easier for (python) users to install combox on their -machine, an effort was made to make it a python -package\cite{py:package-guide}. From version \verb+0.2.0+, combox has -been registered as a python package at the CheeseShop. (Python) Users -can now easily get a copy of combox on their machine with: +machine, an effort was made to make it a python package +\cite{py:package-guide}. From version \verb+0.2.0+, combox has been +registered as a python package at the CheeseShop. (Python) Users can +now easily get a copy of combox on their machine with: \begin{verbatim} pip install combox diff --git a/report/chapters/4-testing.tex b/report/chapters/4-testing.tex @@ -38,7 +38,7 @@ where unit tests helped was just before the \verb+v0.2.0+ release. Major changes, including the introduction of file locks in the \verb+ComboxDirMonitor+, were made to the \verb+combox.events+. When the unit tests were run OS X, two tests failed, revealing a difference -in behavior of watchdog\cite{pylib:watchdog} on GNU/Linux and OS X on +in behavior of watchdog \cite{pylib:watchdog} on GNU/Linux and OS X on file creation \footnote{https://git.ricketyspace.net/combox/commit/?id=8c86e7c28738c66c0e04ae7886b44dbcdfc6369exo}; without unit tests, there is a high probability that this bug would @@ -52,9 +52,9 @@ written feature correctly behaves for use cases that the author of the feature did not consider or did not think about while writing the respective feature. 
-Unit tests failed to reveal bugs \#5, \#6, \#7, \#10 and -\#11\footnote{https://git.ricketyspace.net/combox/plain/TODO.org}; -these bugs were found when manually testing combox. +Unit tests failed to reveal bugs \#5, \#6, \#7, \#10 and \#11 +\footnote{https://git.ricketyspace.net/combox/plain/TODO.org}; these +bugs were found when manually testing combox. \section{Manual testing}\label{sec:4-manual-testing} @@ -81,7 +81,7 @@ nodes. \begin{itemize} \item On the GNU/Linux machines, the official Dropbox client was used to sync the Dropbox node directory to Dropbox' data - store. \verb+rclone+\cite{program:rclone} was used to sync the + store. \verb+rclone+ \cite{program:rclone} was used to sync the Google Drive node directory to Google Drive' data store; at the time of testing, Google Drive does not have a client program for GNU/Linux which can sync to Google Drive's data store. @@ -106,7 +106,7 @@ machine (\verb+grus+) was a physical machine running Debian GNU/Linux testing. The node directories to scatter the files' shards were the Dropbox directory and Google Drive directory. The official Dropbox client was used to automatically sync files from the Dropbox directory -to the Dropbox' data store; \verb+rclone+\cite{program:rclone} was +to the Dropbox' data store; \verb+rclone+ \cite{program:rclone} was used to sync files from Google Drive directory to Google Drive' data store. @@ -118,8 +118,8 @@ store. backup file as a ``new file'' and it split it into shards, encrypted the shards and scattered the shards across the node directories. The right thing for combox to do was to ignore these backup files and do - nothing about them. This issue was fixed on - \verb+2015-09-29+\footnote{https://git.ricketyspace.net/combox/plain/TODO.org}. Now + nothing about them. This issue was fixed on \verb+2015-09-29+ + \footnote{https://git.ricketyspace.net/combox/plain/TODO.org}. 
Now the \verb+ComboxDirMonitor+, on a ``file created'' or ``file modified'' event, returns from the \verb+on_created+ or \verb+on_modified+ callback when it finds that the file is a @@ -145,8 +145,8 @@ store. \end{itemize} All of the above behavior of the Dropbox client broke - combox. Commits between \verb+3d714c5+ to - \verb+6e1133f+\footnote{https://git.ricketyspace.net/combox/log/?qt=range\&q=3d714c5..6e1133f} + combox. Commits between \verb+3d714c5+ to \verb+6e1133f+ + \footnote{https://git.ricketyspace.net/combox/log/?qt=range\&q=3d714c5..6e1133f} fixed combox by making it aware of Dropbox's client behavior. \end{itemize} @@ -237,7 +237,7 @@ automatically sync files from the Dropbox directory to the Dropbox' data store on both the GNU/Linux machine and the OS X machine; the official Google Drive client was used to automatically sync files from the Google Drive directory to Google Drive' data store on OS X and -\verb+rclone+\cite{program:rclone} was used to sync files from the +\verb+rclone+ \cite{program:rclone} was used to sync files from the Google Drive directory to Google Drive's data store on GNU/Linux. \subsubsection{Issues found} @@ -250,8 +250,8 @@ Google Drive directory to Google Drive's data store on GNU/Linux. unpredictable on this computer and if the first shard (shard0) was stored in the Dropbox directory, it will momentarily disappear before the most updated shard becomes available in the Dropbox - directory; this broke combox. This issue was fixed on - 2015-08-25\footnote{https://git.ricketyspace.net/combox/commit/?id=d5b52030348d40600b4c9256f76e5183a85fbb17}. This + directory; this broke combox. This issue was fixed on 2015-08-25 + \footnote{https://git.ricketyspace.net/combox/commit/?id=d5b52030348d40600b4c9256f76e5183a85fbb17}. This issue is not got to do with the nature of the setup but it is related to the Dropbox's behavior elaborated in section \ref{ch-4-2gnus-issues}. 
@@ -262,19 +262,19 @@ Google Drive directory to Google Drive's data store on GNU/Linux. respective location in the Google Drive directory; this behavior of the Google Drive client confused and broke combox. This issue was fixed 2015-09-06 by making combox aware of the official Google - Client's - behavior\footnote{https://git.ricketyspace.net/combox/commit/?id=37385a90f90cb9d4dfd13d9d2e3cbcace8011e9e}. + Client's behavior + \footnote{https://git.ricketyspace.net/combox/commit/?id=37385a90f90cb9d4dfd13d9d2e3cbcace8011e9e}. \item When a non-empty directory was move/renamed on another computer, the old directory was not getting properly deleted on this computer; this was happening because, sometimes, the files under the directory being renamed were not deleted when it was time for \verb+NodeDirMonitor+ to \verb+rmdir+ the old directory. This issue - was fixed on - 2015-09-12\footnote{https://git.ricketyspace.net/combox/commit/?id=9d14db03da5d10d5ab0d7cc76b20e7b1ed5523bf}. + was fixed on 2015-09-12 + \footnote{https://git.ricketyspace.net/combox/commit/?id=9d14db03da5d10d5ab0d7cc76b20e7b1ed5523bf}. \item It was found that \verb+combox.file.rm_path+ function failed when it was given a non-existent path to remove; this issue was - fixed on - 2015-09-12\footnote{https://git.ricketyspace.net/combox/commit/?id=422238eb4904de14842221fa09a2b4028801afb1}. + fixed on 2015-09-12 + \footnote{https://git.ricketyspace.net/combox/commit/?id=422238eb4904de14842221fa09a2b4028801afb1}. 
 \end{itemize}
 \subsubsection{Demo}
@@ -335,7 +335,7 @@ official Dropbox client was used to automatically sync files from
 Dropbox directory to Dropbox' data store on both the GNU/Linux machine
 and the OS X machine; the official Google Drive client was used to
 automatically sync files from the Google Drive directory to Google
-Drive' data store on OS X and \verb+rclone+\cite{program:rclone} was
+Drive' data store on OS X and \verb+rclone+ \cite{program:rclone} was
 used to sync files from the Google Drive directory to Google Drive's
 data store on GNU/Linux; the same USB stick (\verb+ZAPHOD+) was used
 on both GNU/Linux and Dropbox to store the third shard (shard2) of the
@@ -454,8 +454,8 @@ for each dump.
 Stress testing was first done on \verb+2015-11-08+. In mid November
 2015, the \\ \verb+ComboxDirMonitor+ was drastically modified to make
-it use the file Lock shared by the instances of
-\verb+NodeDirMonitor+\footnote{https://git.ricketyspace.net/combox/commit/?id=5aa1ba0c1dcad62931ba27bb66bf115233086d6c}.
+it use the file Lock shared by the instances of \verb+NodeDirMonitor+
+\footnote{https://git.ricketyspace.net/combox/commit/?id=5aa1ba0c1dcad62931ba27bb66bf115233086d6c}.
 The hypothesis was that this change in \verb+ComboxDirMonitor+
 directly affected the performance of combox and therefore the results
 that were got from stress testing on \verb+2015-11-08+ would no longer
@@ -581,17 +581,17 @@ tests.
 \subsection{Tools used}\label{4-st-tu}
-The \verb+dump+
-script\footnote{https://git.ricketyspace.net/combox-paper/plain/dumper/dump}
+The \verb+dump+ script
+\footnote{https://git.ricketyspace.net/combox-paper/plain/dumper/dump}
 was used to dump files to the combox directory between one second
 intervals. A night of Emacs Lisp indulgence made it possible to
 quickly slurp the required data from the combox output and calculate
 the average time to split and encrypt a file and the total amount of
-time taken to process the files for a given
-dump\footnote{https://git.ricketyspace.net/combox-paper/plain/scripts/dumps.el};
+time taken to process the files for a given dump
+\footnote{https://git.ricketyspace.net/combox-paper/plain/scripts/dumps.el};
 lastly \verb+org-mode+ was used to document all data gathered during
-stress
-testing\footnote{https://git.ricketyspace.net/combox-paper/plain/notes/benchmarks.org}.
+stress testing
+\footnote{https://git.ricketyspace.net/combox-paper/plain/notes/benchmarks.org}.
 \subsection{Observations}\label{4-st-o}
@@ -611,19 +611,20 @@ testing\footnote{https://git.ricketyspace.net/combox-paper/plain/notes/benchmark
 \begin{itemize}
-\item Figure \ref{fig:4-st-tt} shows the time it takes combox to
-  process files for a given file dump\footnote{A ``file dump'' here
-  means a bunch of files copied to the combox directory between 1
-  sec intervals.}. As can be observed from the graph, the total time
+\item Fig. \ref{fig:4-st-tt} shows the time it takes combox to process
+  files for a given file dump \footnote{A ``file dump'' here means a
+  bunch of files copied to the combox directory between 1 sec
+  intervals.}. As can be observed from the graph, the total time
   taken to process all the files tends almost linearly increase with
-  the increase in the size of the file dump\footnote{The ``size of the
-  file dump'' is the total size of all files in a given file dump.}.
-\item Figure \ref{fig:4-st-atsae} show the average time it takes
-  combox to split and encrypt a file for a given file dump. There is a
-  steep increase in the average time from the \verb+424.79MiB+ dump
-  and the \verb+1620.00MiB+ dump, after which the average time to
-  split and encrypt a file seems to almost linearly increase; The main
-  reason for this is that the average file size for dumps from
+  the increase in the size of the file dump \footnote{The ``size of
+  the file dump'' is the total size of all files in a given file
+  dump.}.
+\item Fig. \ref{fig:4-st-atsae} show the average time it takes combox
+  to split and encrypt a file for a given file dump. There is a steep
+  increase in the average time from the \verb+424.79MiB+ dump and the
+  \verb+1620.00MiB+ dump, after which the average time to split and
+  encrypt a file seems to almost linearly increase; The main reason
+  for this is that the average file size for dumps from
   \verb+1620.00MiB+ to \verb+10800.00MiB+ are the same.
 \end{itemize}
@@ -643,7 +644,7 @@ testing\footnote{https://git.ricketyspace.net/combox-paper/plain/notes/benchmark
 \end{figure}
 \begin{itemize}
-\item Figure \ref{fig:4-st-tt-diff} shows the graphs for the total
+\item Fig. \ref{fig:4-st-tt-diff} shows the graphs for the total
   amount of time taken to process all files for a given file dump in
   the \verb+2016-01-16+ and \verb+2015-11-8+ stress test. The amount
   of time needed to process all fills seems to be reduced for the
@@ -651,7 +652,7 @@ testing\footnote{https://git.ricketyspace.net/combox-paper/plain/notes/benchmark
   test results and it seems to be slightly higher for the
   \verb+10800.00MiB+ file dump when compared to the \verb+2015+ stress
   test.
-\item Similarly, figure \ref{fig:4-st-atsae-diff} shows the graphs for
+\item Similarly, Fig. \ref{fig:4-st-atsae-diff} shows the graphs for
   the average time to split and encrypt for a given file dump in the
   \verb+2016-01-16+ and the \verb+2015-11-8+ stress test. The average
   time taken seems to be almost the same for the \verb+424.79MiB+ and
@@ -668,13 +669,13 @@ testing\footnote{https://git.ricketyspace.net/combox-paper/plain/notes/benchmark
   would get overwhelmed leading to the computer running out of memory
   and the load average sometimes peaking at \verb+8+. At first, it was
   assumed that there was a bug in combox which caused this to happen,
-  but later it was found that \verb+watchdog+\cite{pylib:watchdog} was
-  generating a large number ``file modified'' events when a huge file
-  (\verb+~500MiB+) was modified. To prevent \verb+watchdog+ from
+  but later it was found that \verb+watchdog+ \cite{pylib:watchdog}
+  was generating a large number ``file modified'' events when a huge
+  file (\verb+~500MiB+) was modified. To prevent \verb+watchdog+ from
   generating a large number ``file modified'' events for a single
   modification of a huge file, a delay proportional to the size of the
   file was created in the \verb+on_modified+ callback methods in both
-  \verb+ComboxDirMonitor+ and
-  \verb+NodeDirMonitor+\footnote{https://git.ricketyspace.net/combox/commit?id=7ed3c9cbe6e56223b043a23408474f9df08f119e},
+  \verb+ComboxDirMonitor+ and \verb+NodeDirMonitor+
+  \footnote{https://git.ricketyspace.net/combox/commit?id=7ed3c9cbe6e56223b043a23408474f9df08f119e},
   this fixed the issue.
 \end{itemize}
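
The final hunk mentions a delay proportional to file size in the \verb+on_modified+ callbacks, introduced so that one modification of a huge file does not trigger a flood of ``file modified'' events. The sketch below illustrates that idea using only the Python standard library. It is not combox's code: it drops events that arrive inside a size-proportional quiet period rather than sleeping inside the callback, and the names `delay_for_size` and `ModifiedDebouncer` and the constants `SECONDS_PER_MIB` and `MAX_DELAY` are hypothetical.

```python
import time

MIB = 1024 * 1024
SECONDS_PER_MIB = 0.01   # assumed scaling factor, not combox's actual value
MAX_DELAY = 10.0         # assumed upper bound on the wait, in seconds


def delay_for_size(size_bytes):
    """Return a wait time in seconds proportional to the file size, capped."""
    return min(MAX_DELAY, (size_bytes / MIB) * SECONDS_PER_MIB)


class ModifiedDebouncer:
    """Collapse bursts of 'modified' events for the same path.

    An event is accepted only if the quiet period computed from the file
    size has elapsed since the last accepted event for that path.
    """

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._last_accepted = {}  # path -> timestamp of last accepted event

    def should_process(self, path, size_bytes):
        now = self._clock()
        last = self._last_accepted.get(path)
        if last is not None and now - last < delay_for_size(size_bytes):
            return False  # still inside the quiet period: drop the event
        self._last_accepted[path] = now
        return True
```

An event handler could call `should_process(path, os.path.getsize(path))` at the top of its modified-event callback and return early when the result is `False`. With the assumed constants, a 500 MiB file gets a quiet period of about 5 seconds.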