combox-paper

notes and other things concerning combox
git clone git://git.ricketyspace.net/combox-paper.git

commit d759884b339dad3faceb6be91ddefc7e85894ea3
parent a6cd780bc7f0ac2a4fc78a06173f85062c31c664
Author: Siddharth Ravikumar <sravik@bgsu.edu>
Date:   Tue, 22 Mar 2016 09:21:37 -0400

Whitespace and Cosmetic fixes.

Diffstat:
report/chapters/2-lit-r.tex | 168++++++++++++++++++++++++++++++++++++++++---------------------------------------
report/chapters/3-arch-d.tex | 152+++++++++++++++++++++++++++++++++++++++++--------------------------------------
report/chapters/4-testing.tex | 494++++++++++++++++++++++++++++++++++++++++---------------------------------------
report/chapters/5-con-f.tex | 21+++++++++++----------
4 files changed, 425 insertions(+), 410 deletions(-)

diff --git a/report/chapters/2-lit-r.tex b/report/chapters/2-lit-r.tex
@@ -5,32 +5,33 @@
 The idea of unifying the storage provided by multiple Internet file
 storage providers and storing all the content in an encrypted form is
-not new. In the past, computer researchers and programmers have devised different
-methods to use multiple file storage providers' storage space. This
-chapter gives an overview of the work done by Yeo et al. in unifying
-the storage provided by Dropbox, Box, Google Drive and Skydrive on
-Android devices \cite{yeo}(Section \ref{2-yeo-sec}); SkyCDS, a content
-delivery service, by Gonzalez et al., which uses publish/subscribe
-overlay paradigm and stores the content across multiple cloud storage
-providers such that only part of the content (in encrypted form) is
-stored on each file storage provider \cite{skycds}(Section
-\ref{2-skycds-sec}); and, lastly, \verb+git-annex+, by Joey
-Hess\cite{person:joeyh}, that allows one to version control and keep
-track of large files with a possibility of encrypting files that are
-stored in ``special remotes'' -- storage provided by Internet file
-storage providers (Section \ref{2-gitannex-sec}).
+not new. In the past, computer researchers and programmers have
+devised different methods to use multiple file storage providers'
+storage space. This chapter gives an overview of the work done by Yeo
+et al. in unifying the storage provided by Dropbox, Box, Google Drive
+and Skydrive on Android devices \cite{yeo}(Section \ref{2-yeo-sec});
+SkyCDS, a content delivery service, by Gonzalez et al., which uses
+publish/subscribe overlay paradigm and stores the content across
+multiple cloud storage providers such that only part of the content
+(in encrypted form) is stored on each file storage provider
+\cite{skycds}(Section \ref{2-skycds-sec}); and, lastly,
+\verb+git-annex+, by Joey Hess\cite{person:joeyh}, that allows one to
+version control and keep track of large files with a possibility of
+encrypting files that are stored in ``special remotes'' -- storage
+provided by Internet file storage providers (Section
+\ref{2-gitannex-sec}).
 \section{Multi Cloud Storage Prototype}\label{2-yeo-sec}
-In the paper ``Leveraging client-side storage techniques for
-enhanced use of multiple consumer cloud storage services on
+In the paper ``Leveraging client-side storage techniques for enhanced
+use of multiple consumer cloud storage services on
 resource-constrained mobile devices'', Yeo et al. show their Android
 mobile application, a prototype, which unifies storage provided by
 Dropbox, Box, Google Drive and SkyDrive. The application allows the
 user to store all their information in a single location on their
-phone and it uses erasure coding \cite{weatherspoon} to split each file
-into \verb`n + k` fragments and spreads the encrypted fragments across
-storage provided by the file storage providers. All basic file
+phone and it uses erasure coding \cite{weatherspoon} to split each
+file into \verb`n + k` fragments and spreads the encrypted fragments
+across storage provided by the file storage providers. All basic file
 operations -- Create, Rename, Update, Delete (CRUD) -- are
 possible. Information about the files stored in the unified location
 is stored in a SQLite database.
 Unlike combox, which depends the file
@@ -51,30 +52,30 @@ feature in a ``resource constrained'' device where storage is
 expensive.
 Yeo et al. propose methods for achieving data de-duplication; file
-compression based on file type; intelligent pre-fetching
-and caching of file fragments and ``automatic restoration in
-exploiting file-versioning''. These features were not implemented in
-the prototype Android application and there is possibility of Yeo et
+compression based on file type; intelligent pre-fetching and caching
+of file fragments and ``automatic restoration in exploiting
+file-versioning''. These features were not implemented in the
+prototype Android application and there is possibility of Yeo et
 al. implementing these features in the future.
-It becomes apparent that Yeo et al. work is of immense importance. This is particularly true when
-we taking into consideration the research done by Yang et al., which
-found that 59\% of the users who use ``cloud storage service'' access
-the service through a smart phone and 42.2\% users access it for
-audio/video \cite{yang}. The research by Yang et al.
-suggests a trend of users' preference for small hand-held computers
-over laptops and desktops.
+It becomes apparent that Yeo et al. work is of immense
+importance. This is particularly true when we taking into
+consideration the research done by Yang et al., which found that 59\%
+of the users who use ``cloud storage service'' access the service
+through a smart phone and 42.2\% users access it for audio/video
+\cite{yang}. The research by Yang et al. suggests a trend of users'
+preference for small hand-held computers over laptops and desktops.
 \section{SkyCDS}\label{2-skycds-sec}
 SkyCDS, by Gonzalez et al., is a content delivery system that splits
-and spreads the content across multiple file storage
-providers \cite{skycds}. According to Gonzalez et al., the main reason
-for designing and developing SkyCDS was to prevent content providers
-from getting locked into just one file storage provider and to
-minimize loss when a file storage provider goes out of business or if
-there is temporary outage in the storage service provided by the file
-storage provider.
+and spreads the content across multiple file storage providers
+\cite{skycds}. According to Gonzalez et al., the main reason for
+designing and developing SkyCDS was to prevent content providers from
+getting locked into just one file storage provider and to minimize
+loss when a file storage provider goes out of business or if there is
+temporary outage in the storage service provided by the file storage
+provider.
 In SkyCDS, the content delivery to subscribers of the content is
 segregated into two distinct layers -- Metadata Flow Layer and the
@@ -91,11 +92,12 @@ responsible for publishing the content using the ``delivery workflow''
 When content has to be dispersed to $k$ file storage providers, the
 content is split into $n$ chunks, $n > k$. This file splitting seems
 to produce 66.7\% of redundancy overhead \cite{skycds}. This file
-splitting scheme also looks very similar to erasure coding, but Gonzalez et
-al. don't explicitly state that the content splitting scheme is indeed
-``erasure coding''. The splitting of content is done by the ``delivery
-workflow'' engine which is invoked when the publisher triggers the
-action to publish the respective content to subscribers.
+splitting scheme also looks very similar to erasure coding, but
+Gonzalez et al. don't explicitly state that the content splitting
+scheme is indeed ``erasure coding''. The splitting of content is done
+by the ``delivery workflow'' engine which is invoked when the
+publisher triggers the action to publish the respective content to
+subscribers.
 To evaluate the effectiveness of SkyCDS, Gonzalez et al.
 state that they've done a case study using the data obtained from the European
@@ -110,9 +112,9 @@ space and reliability.
 \verb+git-annex+ allows one to version controlled large files that are
 not usually feasible to version control under
-\verb+git+\cite{program:git}. \verb+git-annex+ checks in the name
-and other meta-data about the files in git and stores the actual
-content under \verb+.git/annex+ directory. When a file is added to
+\verb+git+\cite{program:git}. \verb+git-annex+ checks in the name and
+other meta-data about the files in git and stores the actual content
+under \verb+.git/annex+ directory. When a file is added to
 \verb+git-annex+, a symlink of the file is created in place of the
 file and the content of the file itself is stored under the
 \verb+.git/annex+ directory.
@@ -148,7 +150,7 @@ add deb-nicholson-80s.medium.webm ok
 ↳ ls -l
 ...
-lrwxrwxrwx 1 rsd rsd 207 May 5 2015 deb-nicholson-80s.medium.webm
+lrwxrwxrwx 1 rsd rsd 207 May 5 2015 deb-nicholson-80s.medium.webm
 -> ../.git/annex/objects/3j/vG/SHA256E-s108196923--7de9484ee96908268e
 21b451eb9805552c32b44da08e70ee861332c87352944f.webm/SHA256E-s10819692
 3--7de9484ee96908268e21b451eb9805552c32b44da08e70ee861332c87352944f.w
@@ -162,12 +164,12 @@ ebm
 }
 Now, the file \verb+deb-nicholson-80s.medium.webm+ is checked into
-\verb+git-annex+ and the command \verb+git annex sync+ can be issued to sync the
-repository to other \verb+git-annex+ repositories. It must be noted
-here that when the repository is synced, the file content itself is
-not transferred to the other \verb+git-annex+ repositories; only the
-file's name and its meta-data that is stored in a separate git branch
-called \verb+git-annex+ are
+\verb+git-annex+ and the command \verb+git annex sync+ can be issued
+to sync the repository to other \verb+git-annex+ repositories. It must
+be noted here that when the repository is synced, the file content
+itself is not transferred to the other \verb+git-annex+ repositories;
+only the file's name and its meta-data that is stored in a separate
+git branch called \verb+git-annex+ are
 transferred\cite{documentation:git-annex-hworks}. In order to create a
 copy of a given file in another git annex repository,
 \verb+git annex get /path/to/filename.ext+ has to done.
@@ -180,36 +182,36 @@ storage providers. At the time of writing this report,
 services:
 {\scriptsize
-\begin{itemize}
-\item Amazon S3
-\item Amazon Glacier
-\item Internet Archive via S3
-\item Box.com
-\item Google drive
-\item Google Cloud Storage
-\item Mega.co.nz
-\item SkyDrive
-\item OwnCloud
-\item Flickr
-\item IMAP
-\item Usenet
-\item chef-vault
-\item hubiC
-\item pCloud
-\item ipfs
-\item Ceph
-\item Blackblaze's B2
-\end{itemize}
+ \begin{itemize}
+ \item Amazon S3
+ \item Amazon Glacier
+ \item Internet Archive via S3
+ \item Box.com
+ \item Google drive
+ \item Google Cloud Storage
+ \item Mega.co.nz
+ \item SkyDrive
+ \item OwnCloud
+ \item Flickr
+ \item IMAP
+ \item Usenet
+ \item chef-vault
+ \item hubiC
+ \item pCloud
+ \item ipfs
+ \item Ceph
+ \item Blackblaze's B2
+ \end{itemize}
 }
 All data pushed to file storage provider's servers can optionally be
 encrypted using one's GPG key.
 For instance, to encrypt data that is
-pushed to the Amazon S3 special remote, the following command is
-used \cite{docs:git-annex-as3}:
+pushed to the Amazon S3 special remote, the following command is used
+\cite{docs:git-annex-as3}:
 \begin{verbatim}
 $ git annex initremote cloud type=S3 keyid=2512E3C7
-initremote cloud (encryption setup with gpg key C910D9222512E3C7)
+initremote cloud (encryption setup with gpg key C910D9222512E3C7)
 (checking bucket) (creating bucket in US) (gpg) ok
 $ git annex describe cloud "at Amazon's US datacenter"
 describe cloud ok
@@ -222,16 +224,16 @@ size \verb+N+, to do that we do:
 \begin{verbatim}
 $ git annex initremote cloud type=S3 chunk=1MiB keyid=2512E3C7
-initremote cloud (encryption setup with gpg key C910D9222512E3C7)
+initremote cloud (encryption setup with gpg key C910D9222512E3C7)
 (checking bucket) (creating bucket in US) (gpg) ok
 $ git annex describe cloud "at Amazon's US datacenter"
 describe cloud ok
 \end{verbatim}
-Upon completion, each file that has to be pushed to the Amazon S3 special
-remote is divided into 1MiB chunks, each chunk is encrypted using the
-GPG key \verb+2512E3C7+ and the encrypted chunks are finally pushed to
-the Amazon S3 remote. It must be noted here that unlike the Multi
-Cloud Storage Prototype or SkyCDS or combox, in \verb+git-annex+ when
-we are using file chunking all the chunks go to the same location --
-in this case, the Amazon S3 remote.
+Upon completion, each file that has to be pushed to the Amazon S3
+special remote is divided into 1MiB chunks, each chunk is encrypted
+using the GPG key \verb+2512E3C7+ and the encrypted chunks are finally
+pushed to the Amazon S3 remote. It must be noted here that unlike the
+Multi Cloud Storage Prototype or SkyCDS or combox, in \verb+git-annex+
+when we are using file chunking all the chunks go to the same location
+-- in this case, the Amazon S3 remote.
diff --git a/report/chapters/3-arch-d.tex b/report/chapters/3-arch-d.tex
@@ -11,9 +11,9 @@
 combox consists of two main components -- the combox directory and the
 node directories. The combox directory is the place where the user
-stores all of their files; the node directories are the directories under
-which encrypted shards of the files (in the combox directory) are
-scattered to. A node directory is the file storage provider's
+stores all of their files; the node directories are the directories
+under which encrypted shards of the files (in the combox directory)
+are scattered to. A node directory is the file storage provider's
 directory. For instance, the Dropbox directory and the Google Drive
 directory are node directories.
@@ -22,15 +22,15 @@ combox splits \verb+humans.txt+ into \verb+N+ shards, where \verb+N+
 is the number of node directories. If there are two node directories
 (Dropbox directory and Google Drive directory), then 2 shards are
 created. Each shard of the file is then encrypted and the encrypted
-shards are spread evenly across the node directories. Now, the Dropbox client and
-the Google client will sync the respective shards that was place under
-their directories to their respective data store.
+shards are spread evenly across the node directories. Now, the Dropbox
+client and the Google client will sync the respective shards that was
+place under their directories to their respective data store.
 \begin{figure}[h]
-\includegraphics[scale=0.6]{3-combox-structure}
-\caption{High level overview of how file creation works when combox is
- setup on two computers.}
-\label{fig:3-combox-structure}
+ \includegraphics[scale=0.6]{3-combox-structure}
+ \caption{High level overview of how file creation works when combox
+ is setup on two computers.}
+ \label{fig:3-combox-structure}
 \end{figure}
 Now, when the user moves to their second computer, the node clients
@@ -47,10 +47,10 @@ for file modification, deletion and rename/move.
 The combox configuration wizard triggers automatically when combox
 finds that it is not configured. The combox configuration wizard
-configures the combox directory; asks the user to point to the location
-of the node directories; and reads the key (passphrase) to be used to
-encrypt file shards that are spread across the node directories. The
-combox configuration is written to
+configures the combox directory; asks the user to point to the
+location of the node directories; and reads the key (passphrase) to be
+used to encrypt file shards that are spread across the node
+directories. The combox configuration is written to
 \verb+$HOME/.combox/config.yaml+. This YAML configuration file can be
 manually edited by the user.
@@ -58,16 +58,17 @@ The
 \verb+config_cb+\footnote{https://git.ricketyspace.net/combox/tree/combox/config.py?id=fb7fdd21\#n90}
 function in the \verb+combox.config+ module is responsible for
 carrying out the combox configuration. Prior to version \verb+0.2.0+,
-the combox configuration was purely done through the Command Line Interface (CLI). From
-\verb+0.2.0+ on wards, by default, the combox configuration is done
-through a graphical interface; it is still possible to configure
-combox through the CLI with the \verb+--cli+ switch.
+the combox configuration was purely done through the Command Line
+Interface (CLI). From \verb+0.2.0+ on wards, by default, the combox
+configuration is done through a graphical interface; it is still
+possible to configure combox through the CLI with the \verb+--cli+
+switch.
 A demo of combox configuration using the graphical interface on
 GNU/Linux can be viewed
 \url{https://ricketyspace.net/combox/combox-config-gui-glued-gnu.webm}{here}.
-T he same demo of combox configuration using the graphical interface on
-OS X can be viewed
+T he same demo of combox configuration using the graphical interface
+on OS X can be viewed
 \url{https://ricketyspace.net/combox/combox-config-gui-glued-osx.webm}{here}.
 \subsection{combox directory monitor}\label{sec:3-combox-cdirm}
@@ -93,8 +94,8 @@ of the file in the local combox data store.
 When a file is deleted in the combox directory, the combox directory
 monitor will remove the encrypted shards of the file in the node
-directories and get rid of the file's hash from the local combox
-data store.
+directories and get rid of the file's hash from the local combox data
+store.
 When a file is moved/renamed in the combox directory, the combox
 directory monitor will move/rename encrypted shards in all the node
@@ -170,17 +171,17 @@ directory.
 \subsection{combox data store}\label{sec:3-combox-db}
 To ``keep it simple, stupid'', combox tracks bare minimum information
-about the files that are stored in the combox directory, depending on file
-system events to do the right thing when changes takes place in the
-combox directory.
-
-The only information that is stored in the combox data store with regards to a
-file in the combox directory is its SHA-512 hash. The SHA-512 hash of
-a file is enough information to detect changes in the file. In the
-data store, there are also four dictionaries -- \verb+file_moved+,
-\verb+file_deleted+, \verb+file_created+, \verb+file_modified+ --
-which track the number of shards of a file that wer
-moved/deleted/created/modified due the respective file being
+about the files that are stored in the combox directory, depending on
+file system events to do the right thing when changes takes place in
+the combox directory.
+
+The only information that is stored in the combox data store with
+regards to a file in the combox directory is its SHA-512 hash. The
+SHA-512 hash of a file is enough information to detect changes in the
+file. In the data store, there are also four dictionaries --
+\verb+file_moved+, \verb+file_deleted+, \verb+file_created+,
+\verb+file_modified+ -- which track the number of shards of a file
+that were moved/deleted/created/modified due the respective file being
 moved/deleted/created/modified on another computer. These four
 dictionaries are primarily used by the \verb+NodeDirMonitor+ to detect
 remote file movement/deletion/creation/modification and triggering
@@ -192,10 +193,10 @@ The data store is a JSON file on the disk, stored by default at \\
 is the sole interface to read from and write to the data store. The
 data store is primarily accessed and modified by the combox directory
 monitor (\verb+ComboxDirMonitor+) and the node directory monitor
-(\verb+NodeDirMonitor+) through a shared \verb+threading. Lock+ that ensures that only
-one entity\footnote{An entity can be the combox directory monitor or
- one of the node directory monitors} can access/modify the database
-at a time.
+(\verb+NodeDirMonitor+) through a shared \verb+threading. Lock+ that
+ensures that only one entity\footnote{An entity can be the combox
+ directory monitor or one of the node directory monitors} can
+access/modify the database at a time.
 Below is an illustration of the structure of the combox data store:
@@ -214,9 +215,9 @@ Below is an illustration of the structure of the combox data store:
 \end{verbatim}
 The \verb+combox.silo.ComboxSilo+, which is the sole interface to read
-from and write to the database, uses the pickleDB
-library \cite{pylib:pickledb}. The pickleDB is a very basic key-value
-store which allows one to store information in the JSON format.
+from and write to the database, uses the pickleDB library
+\cite{pylib:pickledb}. The pickleDB is a very basic key-value store
+which allows one to store information in the JSON format.
 It must be noted that the combox data store on each computer is
 independent and does not communicate or make transactions with the
@@ -224,7 +225,9 @@ combox data store located in other computers.
 \section{combox modules overview}
-combox is spread into modules that have functions and/or classes. Currently, combox is considerably a small program consisting of the following files:
+combox is spread into modules that have functions and/or
+classes. Currently, combox is considerably a small program consisting
+of the following files:
 \begin{verbatim}
 $ wc -l combox/*.py
@@ -295,11 +298,11 @@ extreme brevity.
 change happens in the node directory; subjectively,
 \verb+NodeDirMonitor+ is slightly more complex than the
 \verb+ComboxDirMonitor+.
-\item[combox.file]\footnote{https://git.ricketyspace.net/combox/tree/combox/file.py?id=fb7fdd21} This is the second largest module in combox. It
- contains utility functions for reading, writing, moving
- files/directories, hashing files, splitting a file into shards, gluing
- shards into a file, manipulating directories inside combox and node
- directories.
+\item[combox.file]\footnote{https://git.ricketyspace.net/combox/tree/combox/file.py?id=fb7fdd21}
+ This is the second largest module in combox. It contains utility
+ functions for reading, writing, moving files/directories, hashing
+ files, splitting a file into shards, gluing shards into a file,
+ manipulating directories inside combox and node directories.
 \item[combox.gui]\footnote{https://git.ricketyspace.net/combox/tree/combox/gui.py?id=fb7fdd21}
 Contains the \verb+ComboxConfigDialog+ class; it is the graphical
 interface for configuring combox. The class uses the Tkinter
@@ -311,14 +314,16 @@ extreme brevity.
 \item[combox.log]\footnote{https://git.ricketyspace.net/combox/tree/combox/log.py?id=fb7fdd21}
 All the messages to \verb+stdout+ and \verb+stderr+ are sent through
 the \verb+log_i+ and \verb+log_e+ functions defined in this module.
-\item[combox.silo]\footnote{https://git.ricketyspace.net/combox/tree/combox/silo.py?id=fb7fdd21} Contains the \verb+ComboxSilo+ class which is the
- canonical interface for combox for managing information about the
- files in the combox directory. Internally, the \verb+ComboxSilo+
- class uses the pickleDB library\cite{pylib:pickledb}.
-\item[combox.\_version]\footnote{https://git.ricketyspace.net/combox/tree/combox/\_version.py?id=fb7fdd21} This is \emph{private} module that contains
- variables that contain the value of the present version and release
- of combox. The \verb+get_version+ function in this module returns
- the full version number; this function used by \verb+setup.py+.
+\item[combox.silo]\footnote{https://git.ricketyspace.net/combox/tree/combox/silo.py?id=fb7fdd21}
+ Contains the \verb+ComboxSilo+ class which is the canonical
+ interface for combox for managing information about the files in the
+ combox directory. Internally, the \verb+ComboxSilo+ class uses the
+ pickleDB library\cite{pylib:pickledb}.
+\item[combox.\_version]\footnote{https://git.ricketyspace.net/combox/tree/combox/\_version.py?id=fb7fdd21}
+ This is \emph{private} module that contains variables that contain
+ the value of the present version and release of combox. The
+ \verb+get_version+ function in this module returns the full version
+ number; this function used by \verb+setup.py+.
 \end{description}
 \section{DRY}
@@ -332,13 +337,13 @@ the realm of the ``core functionality of combox''. The main reason
 behind this decision was to not indulge in trying to solve problems
 that others have already solved.
-Accordingly, the \verb+watchdog+\cite{pylib:watchdog} library was chosen for file
-monitoring. This library is compatible with Unix, Unix-like systems
-and Microsoft Windows. The \verb+pycrypto+
-library \cite{pylib:pycrypto} was used for encrypting data. Combox uses
-AES encryption scheme to encrypt file shards. The
-\verb+pickleDB+ \cite{pylib:pickledb} library was used to store
-information about files in the combox directory.
+Accordingly, the \verb+watchdog+\cite{pylib:watchdog} library was
+chosen for file monitoring. This library is compatible with Unix,
+Unix-like systems and Microsoft Windows. The \verb+pycrypto+ library
+\cite{pylib:pycrypto} was used for encrypting data. Combox uses AES
+encryption scheme to encrypt file shards. The \verb+pickleDB+
+\cite{pylib:pickledb} library was used to store information about
+files in the combox directory.
 Looking back, the decision to use external libraries reduced the
 complexity of combox, reduced the time to complete the initial working
@@ -348,10 +353,10 @@ just testing and fixing issues in combox.
 \section{Operating system compatibility}\label{3-os-compat}
 combox was developed on a GNU/Linux machine. A conscious effort was
-made to write the software in an operating system independent way. The top criteria
-for choosing a library to use in combox was that it had to be
-compatible on \emph{all} of the three major computing
-platforms \footnote{GNU/Linux, OS X and, Microsoft Windows}.
+made to write the software in an operating system independent way. The
+top criteria for choosing a library to use in combox was that it had
+to be compatible on \emph{all} of the three major computing platforms
+\footnote{GNU/Linux, OS X and, Microsoft Windows}.
 Prior to the \verb+0.1.0+ release, combox was tested on OS X (See
 chapter \ref{ch:4}) and OS X specific issues that were found were
@@ -362,9 +367,10 @@ After the initial release of combox, it was seen if combox would be
 compatible with Microsoft Windows out of the box. it was found that:
 \begin{itemize}
-\item Setting up the paraphernalia to run combox was
- non-trivial \cite{doc:combox-setup-windoze}.
-\item The unit tests for the \verb+combox.file+ module failed on the Windows Operating System.
+\item Setting up the paraphernalia to run combox was non-trivial
+ \cite{doc:combox-setup-windoze}.
+\item The unit tests for the \verb+combox.file+ module failed on the
+ Windows Operating System.
 \end{itemize}
 At the time of writing the report, combox is at version \verb+0.2.3+
@@ -395,11 +401,11 @@ Finally install combox with:
 python setup.py install
 \end{verbatim}
-Python has a package registry called CheeseShop \footnote{code name for
- Python Package Index, see https://wiki.python.org/moin/CheeseShop}.
-All packages registered at the CheeseShop can be installed using
-\verb+pip+ -- Python's platform independent package management
-system\cite{py:pip} -- with:
+Python has a package registry called CheeseShop \footnote{code name
+ for Python Package Index, see
+ https://wiki.python.org/moin/CheeseShop}. All packages registered
+at the CheeseShop can be installed using \verb+pip+ -- Python's
+platform independent package management system\cite{py:pip} -- with:
 \begin{verbatim}
 pip install packagename
diff --git a/report/chapters/4-testing.tex b/report/chapters/4-testing.tex
@@ -39,8 +39,8 @@ Major changes, including the introduction of file locks in the
 \verb+ComboxDirMonitor+, were made to the \verb+combox.events+. When
 the unit tests were run OS X, two tests failed, revealing a difference
 in behavior of watchdog\cite{pylib:watchdog} on GNU/Linux and OS X on
-file
-creation \footnote{https://git.ricketyspace.net/combox/commit/?id=8c86e7c28738c66c0e04ae7886b44dbcdfc6369exo};
+file creation
+\footnote{https://git.ricketyspace.net/combox/commit/?id=8c86e7c28738c66c0e04ae7886b44dbcdfc6369exo};
 without unit tests, there is a high probability that this bug would
 never have been found by now.
@@ -59,12 +59,12 @@ these bugs were found when manually testing combox.
 \section{Manual testing}\label{sec:4-manual-testing}
 The unit tests for the \verb+combox.events+ module tested the
-correctness of the \\ \verb+ComboxDirMonitor+ and \verb+NodeDirMonitor+
-independently. In order to comprehensively test the correctness of
-both \verb+ComboxDirMonitor+ and \verb+NodeDirMonitor+, it was
-required to manually test combox running on more than one
-computer. Several bugs were found and fixed while doing manual
-testing.
+correctness of the \\ \verb+ComboxDirMonitor+ and
+\verb+NodeDirMonitor+ independently. In order to comprehensively test
+the correctness of both \verb+ComboxDirMonitor+ and
+\verb+NodeDirMonitor+, it was required to manually test combox running
+on more than one computer. Several bugs were found and fixed while
+doing manual testing.
 Three different types of setups were used to manually test combox. The
 first kind of setup has two GNU/Linux machines each using combox to
@@ -100,15 +100,15 @@ combox was run on two GNU/Linux machines and a file was alternatively
 created/modified/renamed/deleted on one of the GNU/Linux machine and
 it was verified if the respective file was also
 created/modified/renamed/deleted on the other GNU/Linux machine. One
-of the GNU/Linux machines, (\verb+lyra)+, was a virtual machine running
-Debian GNU/Linux stable (version 8.x). The other GNU/Linux machine
-(\verb+grus+) was a physical machine running Debian GNU/Linux
+of the GNU/Linux machines, (\verb+lyra)+, was a virtual machine
+running Debian GNU/Linux stable (version 8.x). The other GNU/Linux
+machine (\verb+grus+) was a physical machine running Debian GNU/Linux
 testing. The node directories to scatter the files' shards were the
 Dropbox directory and Google Drive directory. The official Dropbox
 client was used to automatically sync files from the Dropbox directory
 to the Dropbox' data store; \verb+rclone+\cite{program:rclone} was
-used to sync files from Google Drive directory to Google Drive'
-data store.
+used to sync files from Google Drive directory to Google Drive' data
+store.
 \subsubsection{Issues found}\label{ch-4-2gnus-issues}
@@ -153,72 +153,74 @@ data store.
 \subsubsection{Demo}
 A demo of combox being used on two GNU/Linux machines can be viewed at
-\url{https://ricketyspace.net/combox/combox-2-gnus.webm}. \verb+lyra+ (virtual machine) and \verb+grus+ (bare-metal) are the two
-GNU/Linux machines being used for the demo.
+\url{https://ricketyspace.net/combox/combox-2-gnus.webm}. \verb+lyra+
+(virtual machine) and \verb+grus+ (bare-metal) are the two GNU/Linux
+machines being used for the demo.
 Description of what happens in the demo follows:
- - (lyra) install combox.
+- (lyra) install combox.
- - (lyra) run combox (test mode).
+- (lyra) run combox (test mode).
- - (lyra) create file \verb+walden.pond+ with content ``It must be
- beautiful there''.
+- (lyra) create file \verb+walden.pond+ with content ``It must be
+beautiful there''.
- - (lyra) sync Google Drive using \verb+rclone+.
+- (lyra) sync Google Drive using \verb+rclone+.
- - (grus) sync Google Drive using \verb+rclone+.
+- (grus) sync Google Drive using \verb+rclone+.
- - (grus) git pull latest copy of combox.
+- (grus) git pull latest copy of combox.
- - (grus) install combox
+- (grus) install combox
- - (grus) run combox (testing mode).
+- (grus) run combox (testing mode).
- - (grus) verify that \verb+walden.pond+ was create on this machine.
+- (grus) verify that \verb+walden.pond+ was create on this machine.
- - (grus) append 'Peaceful too.' to \verb+walden.pond+.
+- (grus) append 'Peaceful too.' to \verb+walden.pond+.
- - (grus) sync Google Drive using \verb+rclone+.
+- (grus) sync Google Drive using \verb+rclone+.
- - (lyra) sync Google Drive using \verb+rclone+.
+- (lyra) sync Google Drive using \verb+rclone+.
- - (lyra) verify that the latest copy of \verb+walden.pond+ is there
- in the combox directory; it should contain 'Peaceful too.' in the
- last line.
+- (lyra) verify that the latest copy of \verb+walden.pond+ is there in
+the combox directory; it should contain 'Peaceful too.' in the last
+line.
- - (lyra) append ``I've a dream'' to \verb+walden.pond+. +- (lyra) append ``I've a dream'' to \verb+walden.pond+. - - (lyra) sync Google Drive using \verb+rclone+. +- (lyra) sync Google Drive using \verb+rclone+. - - (grus) sync Google Drive using \verb+rclone+. +- (grus) sync Google Drive using \verb+rclone+. - - (grus) verify that the latest copy of \verb+walden.pond+ is there - in the combox directory; it should contain ``I've a dream'' in the - last line. +- (grus) verify that the latest copy of \verb+walden.pond+ is there in +the combox directory; it should contain ``I've a dream'' in the last +line. - - (grus) remove \verb+walden.pond+ from combox directory. +- (grus) remove \verb+walden.pond+ from combox directory. - - (grus) sync Google Drive using \verb+rclone+. +- (grus) sync Google Drive using \verb+rclone+. - - (lyra) sync Google Drive using \verb+rclone+. +- (lyra) sync Google Drive using \verb+rclone+. - - (lyra) verify that \verb+walden.pond+ is removed from the combox - directory. +- (lyra) verify that \verb+walden.pond+ is removed from the combox +directory. - - (grus) open Dropbox and Google drive accounts from the web browser. +- (grus) open Dropbox and Google drive accounts from the web browser. - - (lyra) create file \verb+manufacturing.consent.+ with content ``Chomsky stuff?''. +- (lyra) create file \verb+manufacturing.consent.+ with content +``Chomsky stuff?''. - - (lyra) sync Google Drive using \verb+rclone+. +- (lyra) sync Google Drive using \verb+rclone+. - - (grus) sync Google Drive using \verb+rclone+. +- (grus) sync Google Drive using \verb+rclone+. - - (grus) verify that \verb+manufacturing.consent+ was created in the - combox directory. +- (grus) verify that \verb+manufacturing.consent+ was created in the +combox directory. - - (grus) verify that the shards of \verb+manufacturing.consent+ were - created on Dropbox and Google Drive through the web browser. 
+- (grus) verify that the shards of \verb+manufacturing.consent+ were +created on Dropbox and Google Drive through the web browser. \subsection{Testing on a GNU/Linux and an OS X machine} @@ -249,9 +251,10 @@ Google Drive directory to Google Drive's data store on GNU/Linux. stored in the Dropbox directory, it will momentarily disappear before the most updated shard becomes available in the Dropbox directory; this broke combox. This issue was fixed on - 2015-08-25\footnote{https://git.ricketyspace.net/combox/commit/?id=d5b52030348d40600b4c9256f76e5183a85fbb17}. This issue is not got to do with - the nature of the setup but it is related to the Dropbox's behavior - elaborated in section \ref{ch-4-2gnus-issues}. + 2015-08-25\footnote{https://git.ricketyspace.net/combox/commit/?id=d5b52030348d40600b4c9256f76e5183a85fbb17}. This + issue is not got to do with the nature of the setup but it is + related to the Dropbox's behavior elaborated in section + \ref{ch-4-2gnus-issues}. \item When the official Google Drive client pulls an updated version of the file from Google Drive' data store, instead directly updating the respective file on the computer, it deletes the older version of @@ -263,20 +266,22 @@ Google Drive directory to Google Drive's data store on GNU/Linux. behavior\footnote{https://git.ricketyspace.net/combox/commit/?id=37385a90f90cb9d4dfd13d9d2e3cbcace8011e9e}. \item When a non-empty directory was move/renamed on another computer, the old directory was not getting properly deleted on this computer; - this was happening because, sometimes, the files under the - directory being renamed were not deleted when it was time for + this was happening because, sometimes, the files under the directory + being renamed were not deleted when it was time for \verb+NodeDirMonitor+ to \verb+rmdir+ the old directory. This issue was fixed on 2015-09-12\footnote{https://git.ricketyspace.net/combox/commit/?id=9d14db03da5d10d5ab0d7cc76b20e7b1ed5523bf}. 
\item It was found that \verb+combox.file.rm_path+ function failed when it was given a non-existent path to remove; this issue was - fixed on 2015-09-12\footnote{https://git.ricketyspace.net/combox/commit/?id=422238eb4904de14842221fa09a2b4028801afb1}. + fixed on + 2015-09-12\footnote{https://git.ricketyspace.net/combox/commit/?id=422238eb4904de14842221fa09a2b4028801afb1}. \end{itemize} \subsubsection{Demo} -A demo of combox being used on a GNU/Linux machine and OS X machine can -be viewed at \url{https://ricketyspace.net/combox/combox-gnu-osx.webm} +A demo of combox being used on a GNU/Linux machine and OS X machine +can be viewed at +\url{https://ricketyspace.net/combox/combox-gnu-osx.webm} \verb+lyra+ is the GNU/Linux (virtual) machine and \verb+dhcp-129-1-66-1+ is the OS X machine that is being used for the @@ -284,37 +289,37 @@ demo. The OS X machine is accessed through VNC\cite{article:vnc}. Description of what happens in the demo follows: - - (\verb+lyra+) create file \verb+cat.stevens+ with content ``peace train''. +- (\verb+lyra+) create file \verb+cat.stevens+ with content ``peace +train''. - - (\verb+lyra+) sync Google Drive using \verb+rclone+. +- (\verb+lyra+) sync Google Drive using \verb+rclone+. - - (\verb+dhcp-129-1-66-1+) verify that file \verb+cat.stevens+ is - created with content ``peace train''. +- (\verb+dhcp-129-1-66-1+) verify that file \verb+cat.stevens+ is +created with content ``peace train''. - - (\verb+dhcp-129-1-66-1+) append string ``moonshadow'' to file - \verb+cat.stevens+. +- (\verb+dhcp-129-1-66-1+) append string ``moonshadow'' to file +\verb+cat.stevens+. - - (\verb+lyra+) sync Google Drive using \verb+rclone+. +- (\verb+lyra+) sync Google Drive using \verb+rclone+. - - (\verb+lyra+) verify that the file \verb+cat.stevens+ was updated - (modified); last line must have the string ``moonshadow''. +- (\verb+lyra+) verify that the file \verb+cat.stevens+ was updated +(modified); last line must have the string ``moonshadow''. 
- - (\verb+lyra+) append string ``father and son'' to the file - \verb+cat.stevens+. +- (\verb+lyra+) append string ``father and son'' to the file +\verb+cat.stevens+. - - (\verb+lyra+) sync Google Drive using \verb+rclone+. +- (\verb+lyra+) sync Google Drive using \verb+rclone+. - - (\verb+dhcp-129-1-66-1+) verify that the file \verb+cat.stevens+ - was updated (modified); last line must have the string ``father and - son''. +- (\verb+dhcp-129-1-66-1+) verify that the file \verb+cat.stevens+ was +updated (modified); last line must have the string ``father and son''. - - (\verb+dhcp-129-1-66-1+) rename file \verb+cat.stevens+ to - \verb+yusuf.islam+ +- (\verb+dhcp-129-1-66-1+) rename file \verb+cat.stevens+ to +\verb+yusuf.islam+ - - (\verb+lyra+) sync Google Drive using \verb+rclone+. +- (\verb+lyra+) sync Google Drive using \verb+rclone+. - - (\verb+lyra+) verify that the file \verb+cat.stevens+ was renamed - to \verb+yusuf.islam+. +- (\verb+lyra+) verify that the file \verb+cat.stevens+ was renamed to +\verb+yusuf.islam+. \subsection{Testing with a USB stick as a node} @@ -358,7 +363,8 @@ files stored in combox directory. \subsubsection{Demo} A demo of combox being used with a USB stick as the third node can be -viewed at \url{https://ricketyspace.net/combox/combox-usb-node-demo.webm} +viewed at +\url{https://ricketyspace.net/combox/combox-usb-node-demo.webm} \verb+grus+ is the GNU/Linux machine and \verb+dhcp-129-1-66-1+ is the OS X machine that is being used for the demo. \verb+ZAPHOD+ is the @@ -366,76 +372,75 @@ FAT32 USB stick used as the third node. Description of what happens in the demo follows: - - (\verb+grus+) start combox. +- (\verb+grus+) start combox. - - (\verb+grus+) create a file called \verb+simon.and.garfunkel+ with - content ``the boxer''. +- (\verb+grus+) create a file called \verb+simon.and.garfunkel+ with +content ``the boxer''. - - (\verb+grus+) sync Google Drive using \verb+rclone+. +- (\verb+grus+) sync Google Drive using \verb+rclone+. 
- - (\verb+grus+) stop combox. +- (\verb+grus+) stop combox. - - (\verb+grus+) unmount USB stick (\verb+ZAPHOD+) from \verb+grus+. +- (\verb+grus+) unmount USB stick (\verb+ZAPHOD+) from \verb+grus+. - - (\verb+dhcp-129-1-66-1+) mount USB stick (\verb+ZAPHOD+) to - (\verb+dhcp-129-1-66-1+). +- (\verb+dhcp-129-1-66-1+) mount USB stick (\verb+ZAPHOD+) to +(\verb+dhcp-129-1-66-1+). - - (\verb+dhcp-129-1-66-1+) start Dropbox client. +- (\verb+dhcp-129-1-66-1+) start Dropbox client. - - (\verb+dhcp-129-1-66-1+) start Google Drive client. +- (\verb+dhcp-129-1-66-1+) start Google Drive client. - - (\verb+dhcp-129-1-66-1+) start combox. +- (\verb+dhcp-129-1-66-1+) start combox. - - (\verb+dhcp-129-1-66-1+) verify that the file - \verb+simon.and.garfunkel+ with content ``the boxer'' was created. +- (\verb+dhcp-129-1-66-1+) verify that the file +\verb+simon.and.garfunkel+ with content ``the boxer'' was created. - - (\verb+dhcp-129-1-66-1+) append string ``mrs. robinson'' to file - \verb+simon.and.garfunkel+. +- (\verb+dhcp-129-1-66-1+) append string ``mrs. robinson'' to file +\verb+simon.and.garfunkel+. - - (\verb+dhcp-129-1-66-1+) stop combox. +- (\verb+dhcp-129-1-66-1+) stop combox. - - (\verb+dhcp-129-1-66-1+) stop Google Drive client. +- (\verb+dhcp-129-1-66-1+) stop Google Drive client. - - (\verb+dhcp-129-1-66-1+) stop Dropbox client. +- (\verb+dhcp-129-1-66-1+) stop Dropbox client. - - (\verb+dhcp-129-1-66-1+) unmount the USB stick (\verb+ZAPHOD+) - from (\verb+dhcp-129-1-66-1+). +- (\verb+dhcp-129-1-66-1+) unmount the USB stick (\verb+ZAPHOD+) from +(\verb+dhcp-129-1-66-1+). - - (\verb+grus+) mount the USB stick (\verb+ZAPHOD+) to - (\verb+grus+). +- (\verb+grus+) mount the USB stick (\verb+ZAPHOD+) to (\verb+grus+). - - (\verb+grus+) start combox. +- (\verb+grus+) start combox. - - (\verb+grus+) start Dropbox client. +- (\verb+grus+) start Dropbox client. - - (\verb+grus+) sync Google Drive using \verb+rclone+. +- (\verb+grus+) sync Google Drive using \verb+rclone+. 
- - (\verb+grus+) touch \verb+simon.and.garfunkel.shard2+ in the USB - stick (\verb+ZAPHOD+). +- (\verb+grus+) touch \verb+simon.and.garfunkel.shard2+ in the USB +stick (\verb+ZAPHOD+). - - (\verb+grus+) verify that the file \verb+simon.and.garfunkel+ is - updated; the last line must contain the string ``mrs. robinson''. +- (\verb+grus+) verify that the file \verb+simon.and.garfunkel+ is +updated; the last line must contain the string ``mrs. robinson''. - - (\verb+grus+) remove the file \verb+simon.and.garfunkel+. +- (\verb+grus+) remove the file \verb+simon.and.garfunkel+. - - (\verb+grus+) sync Google Drive using \verb+rclone+. +- (\verb+grus+) sync Google Drive using \verb+rclone+. - - (\verb+grus+) unmount the USB stick (\verb+ZAPHOD+) from - (\verb+grus+). +- (\verb+grus+) unmount the USB stick (\verb+ZAPHOD+) from +(\verb+grus+). - - (\verb+grus+) stop Dropbox client. +- (\verb+grus+) stop Dropbox client. - - (\verb+dhcp-129-1-66-1+) mount the USB stick (\verb+ZAPHOD+) to - (\verb+dhcp-129-1-66-1+). +- (\verb+dhcp-129-1-66-1+) mount the USB stick (\verb+ZAPHOD+) to +(\verb+dhcp-129-1-66-1+). - - (\verb+dhcp-129-1-66-1+) start Google Drive client. +- (\verb+dhcp-129-1-66-1+) start Google Drive client. - - (\verb+dhcp-129-1-66-1+) start Dropbox client. +- (\verb+dhcp-129-1-66-1+) start Dropbox client. - - (\verb+dhcp-129-1-66-1+) start combox. +- (\verb+dhcp-129-1-66-1+) start combox. - - (\verb+dhcp-129-1-66-1+) verify that the file - \verb+simon.and.garfunkel+ was deleted. +- (\verb+dhcp-129-1-66-1+) verify that the file +\verb+simon.and.garfunkel+ was deleted. \section{Stress testing} @@ -443,164 +448,165 @@ Description of what happens in the demo follows: A large number of files of different sizes were dumped to the combox directory between an one second interval to see how combox responds to high load. The file dump size was varied from \verb+424.80MiB+ (27 -files) to \verb+10,800.00MiB+ (180 files). 
The average time taken -to split a file and the total time to process all files were -calculated for each dump. +files) to \verb+10,800.00MiB+ (180 files). The average time taken to +split a file and the total time to process all files were calculated +for each dump. Stress testing was first done on \verb+2015-11-08+. In mid November -2015, the \\ \verb+ComboxDirMonitor+ was drastically modified to make it -use the file Lock shared by the instances of +2015, the \\ \verb+ComboxDirMonitor+ was drastically modified to make +it use the file Lock shared by the instances of \verb+NodeDirMonitor+\footnote{https://git.ricketyspace.net/combox/commit/?id=5aa1ba0c1dcad62931ba27bb66bf115233086d6c}. -The hypothesis was that this change in \verb+ComboxDirMonitor+ directly -affected the performance of combox and therefore the results that were -got from stress testing on \verb+2015-11-08+ would no longer be -valid. Stress testing was again done on \verb+2016-01-16+. The results -of this stress test are in sections \ref{4-st-424} to +The hypothesis was that this change in \verb+ComboxDirMonitor+ +directly affected the performance of combox and therefore the results +that were got from stress testing on \verb+2015-11-08+ would no longer +be valid. Stress testing was again done on \verb+2016-01-16+. The +results of this stress test are in sections \ref{4-st-424} to \ref{4-st-10800}. Section \ref{4-st-tu} gives information about the tools used for stress testing, section \ref{4-st-o} contains the observations and comparisons between this stress test and the one done -on \verb+2015-11-08+, and, lastly section \ref{4-st-if} reveals the issues -that were found with combox by virtue of doing the stress tests. +on \verb+2015-11-08+, and, lastly section \ref{4-st-if} reveals the +issues that were found with combox by virtue of doing the stress +tests. 
\subsection{flac dump (27 files - 424.80MiB)}\label{4-st-424} \begin{center} -\begin{table}[h] -\begin{tabular}{ll} -field & value\\ -\hline -delay between a file dump & 1s\\ -start time of processing & 11:00:54\\ -end time of processing & 11:01:38\\ -total time taken to process all files & 00:00:44\\ -no. of files & 27\\ -total size of all files & 445433187.00 bytes (424.79MiB)\\ -avg. file size & 16497525.00 bytes (15.73MiB)\\ -avg. time to split and encrypt a file & 352.58 ms\\ -\end{tabular} -\caption{Stress Testing combox - flac dump (27 files - 424.79MiB) to combox directory} -\end{table} + \begin{table}[h] + \begin{tabular}{ll} + field & value\\ + \hline + delay between a file dump & 1s\\ + start time of processing & 11:00:54\\ + end time of processing & 11:01:38\\ + total time taken to process all files & 00:00:44\\ + no. of files & 27\\ + total size of all files & 445433187.00 bytes (424.79MiB)\\ + avg. file size & 16497525.00 bytes (15.73MiB)\\ + avg. time to split and encrypt a file & 352.58 ms\\ + \end{tabular} + \caption{Stress Testing combox - flac dump (27 files - 424.79MiB) to combox directory} + \end{table} \end{center} \subsubsection{Differences from previous stress test (2015-11-08)} \begin{itemize} \item Total time to process all files was faster by 1min3secs. -\item Average time to split and encrypt a file reduced by - 28.33ms. +\item Average time to split and encrypt a file reduced by 28.33ms. \end{itemize} -\subsection{20MiB - 90MiB dump (27 files - 1620.00MiB)}\label{4-st-1620} +\subsection{20MiB - 90MiB dump (27 files - + 1620.00MiB)}\label{4-st-1620} \begin{center} -\begin{table}[h] -\begin{tabular}{ll} -field & value\\ -\hline -delay between a file dump & 1s\\ -start time of processing & 12:26:45\\ -end time of processing & 12:29:07\\ -total time taken to process all files & 00:02:22\\ -no. of files & 27\\ -total size of all files & 1698693120.00 bytes (1620.00MiB)\\ -avg. file size & 62914560.00 bytes (60.00MiB)\\ -avg. 
time to split and encrypt a file & 2670.59ms\\ -\end{tabular} -\caption{Stress Testing combox - 20MiB - 90MiB dump (27 files - 1620.00MiB) to combox directory} -\end{table} + \begin{table}[h] + \begin{tabular}{ll} + field & value\\ + \hline + delay between a file dump & 1s\\ + start time of processing & 12:26:45\\ + end time of processing & 12:29:07\\ + total time taken to process all files & 00:02:22\\ + no. of files & 27\\ + total size of all files & 1698693120.00 bytes (1620.00MiB)\\ + avg. file size & 62914560.00 bytes (60.00MiB)\\ + avg. time to split and encrypt a file & 2670.59ms\\ + \end{tabular} + \caption{Stress Testing combox - 20MiB - 90MiB dump (27 files - 1620.00MiB) to combox directory} + \end{table} \end{center} \subsubsection{Differences from previous stress test (2015-11-08)} \begin{itemize} \item Total time to process all files was slower by 4secs. -\item Average time to split and encrypt a file reduced by - 25.52ms. +\item Average time to split and encrypt a file reduced by 25.52ms. \end{itemize} -\subsection{20MiB - 90MiB dump (99 files - 5940.00MiB)}\label{4-st-5940} +\subsection{20MiB - 90MiB dump (99 files - + 5940.00MiB)}\label{4-st-5940} \begin{center} -\begin{table}[h] -\begin{tabular}{ll} -field & value\\ -\hline -delay between a file dump & 1s\\ -start time of processing & 13:10:16\\ -end time of processing & 13:19:26\\ -total time taken to process all files & 00:09:10\\ -no. of files & 99\\ -total size of all files & 6228541440.00 bytes (5940.00MiB)\\ -avg. file size & 62914560.00 bytes (60.00MiB)\\ -avg. time to split and encrypt a file & 2979.64ms\\ -\end{tabular} -\caption{Stress Testing combox - 20MiB - 90MiB dump (99 files - 5940.00MiB) - to combox directory} -\end{table} + \begin{table}[h] + \begin{tabular}{ll} + field & value\\ + \hline + delay between a file dump & 1s\\ + start time of processing & 13:10:16\\ + end time of processing & 13:19:26\\ + total time taken to process all files & 00:09:10\\ + no. 
of files & 99\\ + total size of all files & 6228541440.00 bytes (5940.00MiB)\\ + avg. file size & 62914560.00 bytes (60.00MiB)\\ + avg. time to split and encrypt a file & 2979.64ms\\ + \end{tabular} + \caption{Stress Testing combox - 20MiB - 90MiB dump (99 files - 5940.00MiB) - to combox directory} + \end{table} \end{center} \subsubsection{Differences from previous stress test (2015-11-08)} \begin{itemize} \item Total time to process all files was faster by 59secs. -\item Average time to split and encrypt a file increased by - 206.20ms. +\item Average time to split and encrypt a file increased by 206.20ms. \end{itemize} -\subsection{20MiB - 90MiB dump (180 files - 10800.00MiB)}\label{4-st-10800} +\subsection{20MiB - 90MiB dump (180 files - + 10800.00MiB)}\label{4-st-10800} \begin{center} -\begin{table}[h] -\begin{tabular}{ll} -field & value\\ -\hline -delay between a file dump & 1s\\ -start time of processing & 13:42:06\\ -end time of processing & 14:00:10\\ -total time taken to process all files & 00:18:04\\ -no. of files & 180\\ -total size of all files & 11324620800.00 bytes (10800.00MiB)\\ -avg. file size & 62914560.00 bytes (60.00MiB)\\ -avg. time to split and encrypt a file & 3423.08ms\\ -\end{tabular} -\caption{Stress Testing combox - 20MiB - 90MiB dump (180 files - 10800.00MiB) to combox directory} -\end{table} + \begin{table}[h] + \begin{tabular}{ll} + field & value\\ + \hline + delay between a file dump & 1s\\ + start time of processing & 13:42:06\\ + end time of processing & 14:00:10\\ + total time taken to process all files & 00:18:04\\ + no. of files & 180\\ + total size of all files & 11324620800.00 bytes (10800.00MiB)\\ + avg. file size & 62914560.00 bytes (60.00MiB)\\ + avg. 
time to split and encrypt a file & 3423.08ms\\ + \end{tabular} + \caption{Stress Testing combox - 20MiB - 90MiB dump (180 files - 10800.00MiB) to combox directory} + \end{table} \end{center} \subsubsection{Differences from previous stress test (2015-11-08)} \begin{itemize} \item Total time to process all files was slower by 1min2secs -\item Average time to split and encrypt a file increased by - 399.87ms. +\item Average time to split and encrypt a file increased by 399.87ms. \end{itemize} \subsection{Tools used}\label{4-st-tu} -The \verb+dump+ script\footnote{https://git.ricketyspace.net/combox-paper/plain/dumper/dump} was used to dump files to -the combox directory between one second intervals. A night of Emacs -Lisp indulgence made it possible to quickly slurp the required data -from the combox output and calculate the average time to split and -encrypt a file and the total amount of time taken to process the files -for a given dump\footnote{https://git.ricketyspace.net/combox-paper/plain/scripts/dumps.el}; lastly \verb+org-mode+ was -used to document all data gathered during stress +The \verb+dump+ +script\footnote{https://git.ricketyspace.net/combox-paper/plain/dumper/dump} +was used to dump files to the combox directory between one second +intervals. A night of Emacs Lisp indulgence made it possible to +quickly slurp the required data from the combox output and calculate +the average time to split and encrypt a file and the total amount of +time taken to process the files for a given +dump\footnote{https://git.ricketyspace.net/combox-paper/plain/scripts/dumps.el}; +lastly \verb+org-mode+ was used to document all data gathered during +stress testing\footnote{https://git.ricketyspace.net/combox-paper/plain/notes/benchmarks.org}. 
\subsection{Observations}\label{4-st-o} \begin{figure}[h] -\centering -\input{graphs/tot-time.tex} -\caption{Stress testing combox - Observations - Time taken to process - all files in a given file dump.} -\label{fig:4-st-tt} + \centering \input{graphs/tot-time.tex} + \caption{Stress testing combox - Observations - Time taken to + process all files in a given file dump.} + \label{fig:4-st-tt} \end{figure} \begin{figure}[h] -\centering -\input{graphs/avg-time-sae.tex} -\caption{Stress testing combox - Observations - Avg. time to split and - encrypt a file in a given file dump.} -\label{fig:4-st-atsae} + \centering \input{graphs/avg-time-sae.tex} + \caption{Stress testing combox - Observations - Avg. time to split + and encrypt a file in a given file dump.} + \label{fig:4-st-atsae} \end{figure} @@ -614,25 +620,26 @@ testing\footnote{https://git.ricketyspace.net/combox-paper/plain/notes/benchmark file dump'' is the total size of all files in a given file dump.}. \item Figure \ref{fig:4-st-atsae} show the average time it takes combox to split and encrypt a file for a given file dump. There is a - steep increase in the average time from the \verb+424.79MiB+ - dump and the \verb+1620.00MiB+ dump, after which the average - time to split and encrypt a file seems to almost linearly increase; - The main reason for this is that the average file size for dumps - from \verb+1620.00MiB+ to \verb+10800.00MiB+ are the same. + steep increase in the average time from the \verb+424.79MiB+ dump + and the \verb+1620.00MiB+ dump, after which the average time to + split and encrypt a file seems to almost linearly increase; The main + reason for this is that the average file size for dumps from + \verb+1620.00MiB+ to \verb+10800.00MiB+ are the same. 
\end{itemize} \begin{figure}[h] -\centering -\input{graphs/tot-time-diff.tex} -\caption{Stress testing combox - Difference between 2015 and 2016 tests - time taken to process all files in a given file dump.} -\label{fig:4-st-tt-diff} + \centering \input{graphs/tot-time-diff.tex} + \caption{Stress testing combox - Difference between 2015 and 2016 + tests - time taken to process all files in a given file dump.} + \label{fig:4-st-tt-diff} \end{figure} \begin{figure}[h] -\centering -\input{graphs/avg-time-sae-diff.tex} -\caption{Stress testing combox - Difference between 2015 and 2016 tests - Avg. time to split and encrypt a file in a given file dump.} -\label{fig:4-st-atsae-diff} + \centering \input{graphs/avg-time-sae-diff.tex} + \caption{Stress testing combox - Difference between 2015 and 2016 + tests - Avg. time to split and encrypt a file in a given file + dump.} + \label{fig:4-st-atsae-diff} \end{figure} \begin{itemize} @@ -640,18 +647,18 @@ testing\footnote{https://git.ricketyspace.net/combox-paper/plain/notes/benchmark amount of time taken to process all files for a given file dump in the \verb+2016-01-16+ and \verb+2015-11-08+ stress tests. The amount of time needed to process all files seems to be reduced for the - \verb+5940.00MiB+ file dump when compared to the \verb+2015+ - stress test results and it seems to be slightly higher for the - \verb+10800.00MiB+ file dump when compared to the \verb+2015+ - stress test. + \verb+5940.00MiB+ file dump when compared to the \verb+2015+ stress + test results and it seems to be slightly higher for the + \verb+10800.00MiB+ file dump when compared to the \verb+2015+ stress + test. \item Similarly, figure \ref{fig:4-st-atsae-diff} shows the graphs for the average time to split and encrypt for a given file dump in the \verb+2016-01-16+ and the \verb+2015-11-08+ stress tests. 
The average - time taken seems to be almost the same for the \verb+424.79MiB+ - and the \verb+1620.00MiB+ dump, but for the \verb+5940.00MiB+ - and the \verb+10800.00MiB+ dump the average time taken seems to be - higher for the \verb+2016+ stress test when compared to the - \verb+2015+ stress test. + time taken seems to be almost the same for the \verb+424.79MiB+ and + the \verb+1620.00MiB+ dump, but for the \verb+5940.00MiB+ and the + \verb+10800.00MiB+ dump the average time taken seems to be higher for + the \verb+2016+ stress test when compared to the \verb+2015+ stress + test. \end{itemize} \subsection{Issues found}\label{4-st-if} @@ -670,4 +677,4 @@ testing\footnote{https://git.ricketyspace.net/combox-paper/plain/notes/benchmark \verb+ComboxDirMonitor+ and \verb+NodeDirMonitor+\footnote{https://git.ricketyspace.net/combox/commit?id=7ed3c9cbe6e56223b043a23408474f9df08f119e}, this fixed the issue. -\end{itemize}- \ No newline at end of file +\end{itemize} diff --git a/report/chapters/5-con-f.tex b/report/chapters/5-con-f.tex @@ -10,9 +10,9 @@ combox is at a stage where it can be used as a tool to use the storage provided by two file storage providers -- Google Drive and Dropbox -- such that only part of each file in the encrypted form is stored on the data store of the file storage providers. This method of storing -files on file storage providers makes it difficult, but not impossible, -for file storage providers or ``third parties'' to gain access to the -user's personal files. +files on file storage providers makes it difficult, but not +impossible, for file storage providers or ``third parties'' to gain +access to the user's personal files. combox is at version 0.2.3; it is a Python package licensed under the GNU General Public License version 3 or later. It is compatible with @@ -37,7 +37,9 @@ follows is a non-exhaustive list of things to do in the future: directory. 
At the moment, combox reads the amount of free space available on each node directory (file storage provider's directory) when configuring combox on a computer but does not use this - information to reckon the space left in each node directory. The major issue here is how to determine what space is available without interacting with a service provider's API or asking the end user. + information to reckon the space left in each node directory. The + major issue here is how to determine what space is available without + interacting with a service provider's API or asking the end user. \item Re-think \verb+combox.events+ module. This module was written @@ -63,11 +65,11 @@ follows is a non-exhaustive list of things to do in the future: needs to be written for supporting a new file storage provider, combox must be tested with the new file storage provider's directory as a node directory. If the new file storage provider's client (that - syncs the shards to their data store) makes non-standard changes to its - directory (like the official Dropbox and Google Drive clients do), - then the \verb+combox.events.NodeDirMonitor+ must be accordingly - updated to make combox cognizant of the file storage provider - client's non-standard behavior. + syncs the shards to their data store) makes non-standard changes to + its directory (like the official Dropbox and Google Drive clients + do), then the \verb+combox.events.NodeDirMonitor+ must be + accordingly updated to make combox cognizant of the file storage + provider client's non-standard behavior. \item Make unit tests more modular. At the moment, there are some unit test functions that test more than one usecase/facet of a function @@ -92,4 +94,3 @@ follows is a non-exhaustive list of things to do in the future: contains information about setting up the development environment for combox on Windows. \end{itemize} -
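The derived figures in the stress-test tables of the 4-testing.tex hunks (total processing time, total size in MiB, average file size) can be cross-checked from the raw start/end timestamps and byte counts. What follows is only a minimal sketch for that check, not the project's tooling: the report says the real numbers were extracted with an Emacs Lisp script, and the names `DUMPS`, `elapsed` and `summarize` here are hypothetical. The input values are copied from the four tables.

```python
from datetime import datetime, timedelta

MIB = 1024 * 1024  # bytes per mebibyte

# (label, start, end, total bytes, number of files), copied from the
# stress-test tables; the labels are made up for this sketch.
DUMPS = [
    ("flac", "11:00:54", "11:01:38", 445433187, 27),
    ("1620MiB", "12:26:45", "12:29:07", 1698693120, 27),
    ("5940MiB", "13:10:16", "13:19:26", 6228541440, 99),
    ("10800MiB", "13:42:06", "14:00:10", 11324620800, 180),
]


def elapsed(start: str, end: str) -> timedelta:
    """Time between two same-day HH:MM:SS timestamps."""
    fmt = "%H:%M:%S"
    return datetime.strptime(end, fmt) - datetime.strptime(start, fmt)


def summarize(dump):
    """Recompute the derived table fields for one dump."""
    label, start, end, total_bytes, count = dump
    return {
        "label": label,
        "elapsed": elapsed(start, end),
        "total_mib": round(total_bytes / MIB, 2),
        "avg_mib": round(total_bytes / count / MIB, 2),
    }


if __name__ == "__main__":
    for dump in DUMPS:
        print(summarize(dump))
```

For the flac dump, 445433187 bytes is roughly 424.798 MiB, which rounds to 424.80 and truncates to 424.79; that may be why both figures appear in the section heading and the table.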