Diffstat (limited to 'report/chapters/2-lit-r.tex')
-rw-r--r-- report/chapters/2-lit-r.tex 168
1 files changed, 85 insertions, 83 deletions
diff --git a/report/chapters/2-lit-r.tex b/report/chapters/2-lit-r.tex
index 4e26923..28e24c7 100644
--- a/report/chapters/2-lit-r.tex
+++ b/report/chapters/2-lit-r.tex
@@ -5,32 +5,33 @@
The idea of unifying the storage provided by multiple Internet file
storage providers and storing all the content in an encrypted form is
-not new. In the past, computer researchers and programmers have devised different
-methods to use multiple file storage providers' storage space. This
-chapter gives an overview of the work done by Yeo et al. in unifying
-the storage provided by Dropbox, Box, Google Drive and Skydrive on
-Android devices \cite{yeo}(Section \ref{2-yeo-sec}); SkyCDS, a content
-delivery service, by Gonzalez et al., which uses publish/subscribe
-overlay paradigm and stores the content across multiple cloud storage
-providers such that only part of the content (in encrypted form) is
-stored on each file storage provider \cite{skycds}(Section
-\ref{2-skycds-sec}); and, lastly, \verb+git-annex+, by Joey
-Hess\cite{person:joeyh}, that allows one to version control and keep
-track of large files with a possibility of encrypting files that are
-stored in ``special remotes'' -- storage provided by Internet file
-storage providers (Section \ref{2-gitannex-sec}).
+not new. In the past, computer researchers and programmers have
+devised different methods to use multiple file storage providers'
+storage space. This chapter gives an overview of the work done by Yeo
+et al. in unifying the storage provided by Dropbox, Box, Google Drive
+and SkyDrive on Android devices \cite{yeo} (Section \ref{2-yeo-sec});
+SkyCDS, a content delivery service by Gonzalez et al., which uses a
+publish/subscribe overlay paradigm and stores the content across
+multiple cloud storage providers such that only part of the content
+(in encrypted form) is stored on each file storage provider
+\cite{skycds} (Section \ref{2-skycds-sec}); and, lastly,
+\verb+git-annex+, by Joey Hess\cite{person:joeyh}, which allows one to
+version control and keep track of large files, with the possibility of
+encrypting files that are stored in ``special remotes'' -- storage
+provided by Internet file storage providers (Section
+\ref{2-gitannex-sec}).
\section{Multi Cloud Storage Prototype}\label{2-yeo-sec}
-In the paper ``Leveraging client-side storage techniques for
-enhanced use of multiple consumer cloud storage services on
+In the paper ``Leveraging client-side storage techniques for enhanced
+use of multiple consumer cloud storage services on
resource-constrained mobile devices'', Yeo et al. show their Android
mobile application, a prototype, which unifies storage provided by
Dropbox, Box, Google Drive and SkyDrive. The application allows the
user to store all their information in a single location on their
-phone and it uses erasure coding \cite{weatherspoon} to split each file
-into \verb`n + k` fragments and spreads the encrypted fragments across
-storage provided by the file storage providers. All basic file
+phone and it uses erasure coding \cite{weatherspoon} to split each
+file into \verb`n + k` fragments and spreads the encrypted fragments
+across storage provided by the file storage providers. All basic file
operations -- Create, Rename, Update, Delete (CRUD) -- are
possible. Information about the files stored in the unified location
is stored in a SQLite database. Unlike combox, which depends on the file
@@ -51,30 +52,30 @@ feature in a ``resource constrained'' device where storage is
expensive.
Yeo et al. propose methods for achieving data de-duplication; file
-compression based on file type; intelligent pre-fetching
-and caching of file fragments and ``automatic restoration in
-exploiting file-versioning''. These features were not implemented in
-the prototype Android application and there is possibility of Yeo et
+compression based on file type; intelligent pre-fetching and caching
+of file fragments and ``automatic restoration in exploiting
+file-versioning''. These features were not implemented in the
+prototype Android application, and there is a possibility of Yeo et
al. implementing these features in the future.
-It becomes apparent that Yeo et al. work is of immense importance. This is particularly true when
-we taking into consideration the research done by Yang et al., which
-found that 59\% of the users who use ``cloud storage service'' access
-the service through a smart phone and 42.2\% users access it for
-audio/video \cite{yang}. The research by Yang et al.
-suggests a trend of users' preference for small hand-held computers
-over laptops and desktops.
+It becomes apparent that the work of Yeo et al. is of immense
+importance. This is particularly true when we take into
+consideration the research done by Yang et al., which found that 59\%
+of the users who use a ``cloud storage service'' access the service
+through a smart phone and 42.2\% of users access it for audio/video
+\cite{yang}. The research by Yang et al. suggests a trend of users'
+preference for small hand-held computers over laptops and desktops.
\section{SkyCDS}\label{2-skycds-sec}
SkyCDS, by Gonzalez et al., is a content delivery system that splits
-and spreads the content across multiple file storage
-providers \cite{skycds}. According to Gonzalez et al., the main reason
-for designing and developing SkyCDS was to prevent content providers
-from getting locked into just one file storage provider and to
-minimize loss when a file storage provider goes out of business or if
-there is temporary outage in the storage service provided by the file
-storage provider.
+and spreads the content across multiple file storage providers
+\cite{skycds}. According to Gonzalez et al., the main reason for
+designing and developing SkyCDS was to prevent content providers from
+getting locked into just one file storage provider and to minimize
+loss when a file storage provider goes out of business or if there is a
+temporary outage in the storage service provided by the file storage
+provider.
In SkyCDS, the content delivery to subscribers of the content is
segregated into two distinct layers -- Metadata Flow Layer and the
@@ -91,11 +92,12 @@ responsible for publishing the content using the ``delivery workflow''
When content has to be dispersed to $k$ file storage providers, the
content is split into $n$ chunks, $n > k$. This file splitting seems
to produce 66.7\% of redundancy overhead \cite{skycds}. This file
-splitting scheme also looks very similar to erasure coding, but Gonzalez et
-al. don't explicitly state that the content splitting scheme is indeed
-``erasure coding''. The splitting of content is done by the ``delivery
-workflow'' engine which is invoked when the publisher triggers the
-action to publish the respective content to subscribers.
+splitting scheme also looks very similar to erasure coding, but
+Gonzalez et al. don't explicitly state that the content splitting
+scheme is indeed ``erasure coding''. The splitting of content is done
+by the ``delivery workflow'' engine which is invoked when the
+publisher triggers the action to publish the respective content to
+subscribers.
To evaluate the effectiveness of SkyCDS, Gonzalez et al. state that
they've done a case study using the data obtained from the European
@@ -110,9 +112,9 @@ space and reliability.
\verb+git-annex+ allows one to version control large files that are
not usually feasible to version control under
-\verb+git+\cite{program:git}. \verb+git-annex+ checks in the name
-and other meta-data about the files in git and stores the actual
-content under \verb+.git/annex+ directory. When a file is added to
+\verb+git+\cite{program:git}. \verb+git-annex+ checks in the name and
+other meta-data about the files in git and stores the actual content
+under \verb+.git/annex+ directory. When a file is added to
\verb+git-annex+, a symlink of the file is created in place of the
file and the content of the file itself is stored under the
\verb+.git/annex+ directory.
@@ -148,7 +150,7 @@ add deb-nicholson-80s.medium.webm ok
↳ ls -l
...
-lrwxrwxrwx 1 rsd rsd 207 May 5 2015 deb-nicholson-80s.medium.webm
+lrwxrwxrwx 1 rsd rsd 207 May 5 2015 deb-nicholson-80s.medium.webm
-> ../.git/annex/objects/3j/vG/SHA256E-s108196923--7de9484ee96908268e
21b451eb9805552c32b44da08e70ee861332c87352944f.webm/SHA256E-s10819692
3--7de9484ee96908268e21b451eb9805552c32b44da08e70ee861332c87352944f.w
@@ -162,12 +164,12 @@ ebm
}
Now, the file \verb+deb-nicholson-80s.medium.webm+ is checked into
-\verb+git-annex+ and the command \verb+git annex sync+ can be issued to sync the
-repository to other \verb+git-annex+ repositories. It must be noted
-here that when the repository is synced, the file content itself is
-not transferred to the other \verb+git-annex+ repositories; only the
-file's name and its meta-data that is stored in a separate git branch
-called \verb+git-annex+ are
+\verb+git-annex+ and the command \verb+git annex sync+ can be issued
+to sync the repository to other \verb+git-annex+ repositories. It must
+be noted here that when the repository is synced, the file content
+itself is not transferred to the other \verb+git-annex+ repositories;
+only the file's name and its meta-data that is stored in a separate
+git branch called \verb+git-annex+ are
transferred\cite{documentation:git-annex-hworks}. In order to create a
copy of a given file in another git annex repository,
\verb+git annex get /path/to/filename.ext+ has to be run.
@@ -180,36 +182,36 @@ storage providers. At the time of writing this report,
services:
{\scriptsize
-\begin{itemize}
-\item Amazon S3
-\item Amazon Glacier
-\item Internet Archive via S3
-\item Box.com
-\item Google drive
-\item Google Cloud Storage
-\item Mega.co.nz
-\item SkyDrive
-\item OwnCloud
-\item Flickr
-\item IMAP
-\item Usenet
-\item chef-vault
-\item hubiC
-\item pCloud
-\item ipfs
-\item Ceph
-\item Blackblaze's B2
-\end{itemize}
+ \begin{itemize}
+ \item Amazon S3
+ \item Amazon Glacier
+ \item Internet Archive via S3
+ \item Box.com
+ \item Google Drive
+ \item Google Cloud Storage
+ \item Mega.co.nz
+ \item SkyDrive
+ \item OwnCloud
+ \item Flickr
+ \item IMAP
+ \item Usenet
+ \item chef-vault
+ \item hubiC
+ \item pCloud
+ \item ipfs
+ \item Ceph
+ \item Backblaze's B2
+ \end{itemize}
}
All data pushed to file storage provider's servers can optionally be
encrypted using one's GPG key. For instance, to encrypt data that is
-pushed to the Amazon S3 special remote, the following command is
-used \cite{docs:git-annex-as3}:
+pushed to the Amazon S3 special remote, the following command is used
+\cite{docs:git-annex-as3}:
\begin{verbatim}
$ git annex initremote cloud type=S3 keyid=2512E3C7
-initremote cloud (encryption setup with gpg key C910D9222512E3C7)
+initremote cloud (encryption setup with gpg key C910D9222512E3C7)
(checking bucket) (creating bucket in US) (gpg) ok
$ git annex describe cloud "at Amazon's US datacenter"
describe cloud ok
@@ -222,16 +224,16 @@ size \verb+N+, to do that we do:
\begin{verbatim}
$ git annex initremote cloud type=S3 chunk=1MiB keyid=2512E3C7
-initremote cloud (encryption setup with gpg key C910D9222512E3C7)
+initremote cloud (encryption setup with gpg key C910D9222512E3C7)
(checking bucket) (creating bucket in US) (gpg) ok
$ git annex describe cloud "at Amazon's US datacenter"
describe cloud ok
\end{verbatim}
-Upon completion, each file that has to be pushed to the Amazon S3 special
-remote is divided into 1MiB chunks, each chunk is encrypted using the
-GPG key \verb+2512E3C7+ and the encrypted chunks are finally pushed to
-the Amazon S3 remote. It must be noted here that unlike the Multi
-Cloud Storage Prototype or SkyCDS or combox, in \verb+git-annex+ when
-we are using file chunking all the chunks go to the same location --
-in this case, the Amazon S3 remote.
+Upon completion, each file that has to be pushed to the Amazon S3
+special remote is divided into 1MiB chunks, each chunk is encrypted
+using the GPG key \verb+2512E3C7+ and the encrypted chunks are finally
+pushed to the Amazon S3 remote. It must be noted here that unlike the
+Multi Cloud Storage Prototype or SkyCDS or combox, in \verb+git-annex+
+when we are using file chunking all the chunks go to the same location
+-- in this case, the Amazon S3 remote.
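The chunking behaviour described in the last paragraph can be illustrated with a minimal Python sketch. It is not how \verb+git-annex+ is implemented; the function names \verb+split_into_chunks+ and \verb+plan_upload+ and the remote name \verb+cloud+ are hypothetical, and GPG encryption of each chunk is deliberately left out. It only shows that a file is cut into fixed-size 1 MiB pieces and that every piece is assigned to the same remote.

```python
import hashlib
from typing import Iterator, List, Tuple

CHUNK_SIZE = 1024 * 1024  # 1 MiB, mirroring chunk=1MiB in the example above


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield consecutive slices of at most chunk_size bytes; the last may be shorter."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for offset in range(0, len(data), chunk_size):
        yield data[offset:offset + chunk_size]


def plan_upload(data: bytes, remote: str = "cloud") -> List[Tuple[str, int, str]]:
    """Return (remote, chunk index, SHA-256 digest) for every chunk.

    Hypothetical helper: all chunks are assigned to the one remote given,
    and encryption is omitted; the digest only identifies each chunk.
    """
    return [
        (remote, index, hashlib.sha256(chunk).hexdigest())
        for index, chunk in enumerate(split_into_chunks(data))
    ]


if __name__ == "__main__":
    payload = b"x" * (2 * CHUNK_SIZE + 512 * 1024)  # 2.5 MiB of sample data
    for remote, index, digest in plan_upload(payload):
        print(remote, index, digest[:16])
```

In this sketch a 2.5 MiB input produces three chunks (two full ones and one half-sized one), all destined for the single remote, in contrast to the Multi Cloud Storage Prototype, SkyCDS and combox, where the pieces are spread across providers.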