summaryrefslogtreecommitdiffstats
path: root/report/chapters/1-intr.tex
blob: 9f32e11f3c9416e81150c01aef20352ed1d273c7 (plain) (blame)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
\chapter{Introduction}

\epigraph{From a security perspective, if you're connected, you're
  screwed.}{\textit{Daniel J. Bernstein}}

Internet companies have made it trivial for computer users to store
data/information on their servers and at the same time there is a lot
of evidence of governments and other powerful organizations being able
to access information/data stored on the Internet companies' computers
\cite{website:wikileaks-spyfiles}. Also, most companies add a standard
clause in their privacy policy that allow them to disclose information
about users or information stored/created by users to ``third
parties'':

\begin{quote}
  \emph{Law \& Order}. We may disclose your information to third
  parties if we determine that such disclosure is reasonably necessary
  to (a) comply with the law; (b) protect any person from death or
  serious bodily injury; (c) prevent fraud or abuse of Dropbox or our
  users; or (d) protect Dropbox's property rights. -- Dropbox Privacy
  Policy \cite{website:dropbox-privacy}
\end{quote}

In this type of world, it would be good to have a program that would
encrypt all the data/information before storing it on the storage
provided by Internet companies. combox aims to be one such program
which not only encrypts but stores only a part of the encrypted
data/information on the storage provided by an Internet company, thus
making it non-trivial for ``third parties'' to access the user's
data/information in its entirety. Section \ref{1-sec-cb} gives a
conceptual introduction to combox; Section \ref{1-sec-cb-diff}
enumerates how combox is different from Vollmar's Combo-Box; lastly,
section \ref{1-sec-using-cb} contains information on how one can start
using combox.

\section{What is combox?}\label{1-sec-cb}

combox allows the user to store all of their files in the ``combox
directory'' and combox picks each file stored in the combox directory,
splits them into $N$ shards, encrypts each of the $N$ shards and
spreads the shards to $N$ node directories. A ``node directory'' is
the directory of the file storage provider (Dropbox directory is a
node directory). Fig. \ref{fig:1-combox-overview-0}, illustrates how a
file called \verb+strunk-white.pdf+ is split, encrypted and spread
across $N$ node directories; shards \verb+strunk-white.pdf.shard0+ to
\verb+strunk-white.pdf.shardN+ are encrypted.

\begin{figure}[h]
\begin{verbatim}

                                  __________________________
                                  |                         |
                               -->| strunk-white.pdf.shard0 |
                               |  |                         |
         ___________________   |  |_________________________|
         |                  |  |  node directory 0
         | strunk-white.pdf | /
         |                  | |   __________________________
         |__________________| |\  |                         |
         combox directory     ||  | strunk-white.pdf.shard1 |
                              ||->|                         |
                              |   |_________________________|
                              |   node directory 1
                              |           .
                              |           .
                              |           .
                              |
                              |   __________________________
                              |   |                         |
                              --->| strunk-white.pdf.shardN |
                                  |                         |
                                  |_________________________|
                                  node directory N
\end{verbatim}
  \caption{combox overview - Splitting a file and spreading it across
    N node directories.}
  \label{fig:1-combox-overview-0}
\end{figure}

combox does not use the API provided by the file storage providers to
sync encrypted shards stored in the node directories to the respective
file storage providers' data store. Instead, it depends on the
respective file storage provider's client program to sync the shards.

combox can be used on all of the user's computers. For instance, the
user can install combox on their second computer and combox will
reconstruct the file from the encrypted shards stored in the node
directories into the combox directory on their second computer;
Fig. \ref{fig:1-combox-overview-1} illustrates this. Here too, combox
depends on the client program of the respective file storage provider
to sync shards to/from the file storage provider's data store and
to/from the respective node directory on the user's computer.

\begin{figure}[h]
\begin{verbatim}

       __________________________
       |                         |
       | strunk-white.pdf.shard0 |
       |                         |\
       |_________________________| \   ___________________
       node directory 0             \  |                  |
                                    |->| strunk-white.pdf |
       __________________________  |-->|                  |
       |                         | | ->|__________________|
       | strunk-white.pdf.shard1 |-- | combox directory
       |                         |   |
       |_________________________|   |
       node directory 1              |
               .                     |
               .                     |
               .                     |
                                     |
       __________________________    |
       |                         |   |
       | strunk-white.pdf.shardN |----
       |                         |
       |_________________________|
       node directory N

\end{verbatim}
  \caption{combox overview - Reconstructing a file from the encrypted
    shards.}
  \label{fig:1-combox-overview-1}
\end{figure}

As of combox version \verb+0.2.3+, combox is compatible on GNU/Linux
and OS X, it supports just two file storage providers -- Google Drive
and Dropbox.

\section{How is combox different from Combo-Box?}\label{1-sec-cb-diff}

Combo-Box by Wesley Vollmar \cite{vollmar-combo-box} was the first
implementation of the idea of storing encrypted shards of a file on
storage provided different file storage providers and depending on the
file storage provider's client to sync shards to their respective data
store. Differences between Vollmar's Combo-Box and combox are
enumerated below:

\begin{description}
\item[Platform] Combo-Box runs on Microsoft Windows, whereas combox
  runs on GNU/Linux and OS X and is not compatible with Microsoft
  Windows as of version 0.2.3.
\item[File splitting] Combo-Box splits a file into shards based on the
  space available on each node directory \cite{vollmar-combo-box},
  while combox is not yet cognizant about space left on each node
  directory and splits the file into $N$ equal shards, where $N$ is
  equal to the number of node directories.
\item[User Interface] Combo-Box is a graphical application while
  combox is mostly a command-line program. combox's configuration
  wizard has a graphical interface. The configuration wizard has a
  command-line interface too for users who like TUI.
\item[Database] Combo-Box uses a traditional SQL database with two
  tables to keep track of files' shards, files' hash, files' last
  ``sync time'' and for ``security and stability'' uses stored
  procedures that retrieve/store information in the database
  \cite{vollmar-combo-box}.

  combox on the other hand uses a key-value data store to track the
  files stored in the combox directory using the pickleDB library
  \cite{pylib:pickledb}. The key-value data store is a JSON file and
  all access to this data store is done through an instance of
  \verb+combox.silo.ComboxSilo+ class
  \footnote{https://git.ricketyspace.net/combox/tree/combox/silo.py?id=fb7fdd218\#n29}
  which ensures that only one thread can read from or write to the
  data store at any time through a lock (\verb+threading.Lock+). In
  the data store, combox keeps track of the hashes of all the files
  stored in the combox directory; the data store also contains
  dictionaries that track number of shards which have been
  create/moved/modified/deleted on another computer.

\item[Installation] Combo-Box uses the proprietary InstallShield
  \cite{nonfree-installshield} to install the program, setup shortcuts
  and registry settings \cite{vollmar-combo-box}.

  combox is a python package, it can either be installed through
  python's package manager (\verb+pip+ \cite{py:pip}) with
  \verb+pip install combox+ or it can be installed from the source
  with the standard \verb+python setup.py install+.

\item[Configuration] Combo-Box saves its configuration inside the
  Combo-Box directory and this configuration is shared by all
  computers on which the user chooses to run Combo-Box, by virtue of
  this, the file providers' directories and the Combo-Box directory
  must be in the same locations on all the computers.

  combox stores its configuration at
  \verb+$HOME/.combox/config.yaml+. The configuration file is not
  shared on computers on which the user runs combox. This makes it
  possible to keep the combox directory and the directories of the
  file storage providers' (node directories) in different locations on
  each computer. The configuration file is a YAML file and can be
  directly edited by the user if they wish to.
\end{description}

\section{Using combox}\label{1-sec-using-cb}

Installing and running combox is relatively easy for Unix users:

\begin{verbatim}
   $ pip install combox
   $ combox
\end{verbatim}

For detailed information on installing combox, see \\
https://ricketyspace.net/combox/setup/.

\subsection{Caveats}

combox is extremely event-driven and depends on filesystem events to
do the correct action when a file is created/modified/moved/deleted,
so the user must make sure to start combox before starting the file
storage providers' client programs that sync encrypted shards to the
respective node directories. On GNU/Linux distributions this can be
automated through the distribution's start-up system (most GNU/Linux
distributions seem to use \verb+systemd+ \cite{website:systemd}).