combox-paper

notes and other things concerning combox
git clone git://git.ricketyspace.net/combox-paper.git

commit d6f6fa899067e364e0e42d17a2d20b3cb3067b75
parent 9497ded90b9e48a08e103fab25ccac8adf5bc4d5
Author: Siddharth Ravikumar <sravik@bgsu.edu>
Date:   Mon,  7 Mar 2016 19:56:33 -0500

Edited Chapter 2.

Diffstat:
report/chapters/2-lit-r.tex | 99++++++++++++++++++++++++++++++++++++++++---------------------------------------
report/combox-report.pdf | 0
2 files changed, 50 insertions(+), 49 deletions(-)

diff --git a/report/chapters/2-lit-r.tex b/report/chapters/2-lit-r.tex
@@ -5,15 +5,15 @@
 
 The idea of unifying the storage provided by multiple Internet file
 storage providers and storing all the content in an encrypted form is
-not new, computer researchers/scientists, programmers have devised
-different methods to use multiple file storage providers' storage
-space. This chapter gives an overview of the work done by Yeo et
-al. in unifying the storage provided by Dropbox, Box, Google Drive and
-Skydrive on Android devices\cite{yeo}(Section \ref{2-yeo-sec});
-SkyCDS, a content delivery service, by Gonzalez et al., which uses
-publish/subscribe overly paradigm and stores the content across
-multiple ``cloud'' storage providers such that only part of the
-content (in encrypted form) is stored on each ``cloud'' storage
+not new, computer researchers and programmers have devised different
+methods to use multiple file storage providers' storage space. This
+chapter gives an overview of the work done by Yeo et al. in unifying
+the storage provided by Dropbox, Box, Google Drive and Skydrive on
+Android devices\cite{yeo}(Section \ref{2-yeo-sec}); SkyCDS, a content
+delivery service, by Gonzalez et al., which uses publish/subscribe
+overlay paradigm and stores the content across multiple ``cloud''
+storage providers such that only part of the content (in encrypted
+form) is stored on each ``cloud'' storage
 provider\cite{skycds}(Section \ref{2-skycds-sec}); lastly,
 \verb+git-annex+, by Joey Hess\cite{person:joeyh}, that allows one to
 version control and keep track of large files with a possibility of
@@ -33,34 +33,35 @@ phone and the application uses erasure coding\cite{weatherspoon} to
 split each file into \verb`n + k` fragments and spreads the encrypted
 fragments across storage provided by the file storage providers. All
 basic file operations -- Create, Rename, Update, Delete (CRUD) -- are
-possible. Information about the file stored in a unified location is
-stored in a SQLite database. Unlike combox, which depends the file
+possible. Information about the files stored in the unified location
+is stored in a SQLite database. Unlike combox, which depends the file
 storage provider' client to sync file fragments/shards to the file
-storage provider's server, the android application developed by Yeo et
+storage provider's server, the Android application developed by Yeo et
 al. takes the responsibility to sync file fragments/shards to each
-file storage provider and usesd the OAuth 2.0\cite{protocol:oauth2}
+file storage provider and uses the OAuth 2.0\cite{protocol:oauth2}
 protocol for authorization.
 
-For encrypting file fragments, they use AES-256; they key for
-encrypting is derived from the user's password by using Password-Based
-Key Derivation Function (PBKDF2)\cite{kaliski}. For erasure coding
-they use the JigDFS librarary\cite{jigdfs}. The android application is
-able do ``progressive streaming'' of media files; this means that
-large media files can be streamed in real-time from the from the file
-storage providers' servers; this is an attractive feature in a
-``resource contrained'' device where storage is expensive.
-
-Yeo et al. propose methods for achieving data de-duplication, file
-fragment/shard compression based on the type of the file, intelligent
-pre-fetching and caching for file fragrments and ``automatic
-restoration in exploiting file-versioning''; these features were not
-implemented in the prototype Android application and there is
-possibility of Yeo et al. implementing these features in the future.
-
-It becomes that that Yeo et al. work is of immense importance when we
-take into consideration the research done by Yang et al., which found
-that 59\% of the users who use ``cloud storage service'' access the
-service through a smart phone and 42.2\% users access
+For encrypting file fragments, they use AES-256; the key for
+encrypting file fragments is derived from the user's password by using
+Password-Based Key Derivation Function (PBKDF2)\cite{kaliski}. For
+erasure coding they use the JigDFS librarary\cite{jigdfs}. The Android
+application is able do ``progressive streaming'' of media files; this
+means that large media files can be streamed in real-time from the
+from the file storage providers' servers; this is an attractive
+feature in a ``resource contrained'' device where storage is
+expensive.
+
+Yeo et al. propose methods for achieving data de-duplication; file
+compression based on the type of the file; intelligent pre-fetching
+and caching of file fragrments and ``automatic restoration in
+exploiting file-versioning''; these features were not implemented in
+the prototype Android application and there is possibility of Yeo et
+al. implementing these features in the future.
+
+It becomes apparent that Yeo et al. work is of immense importance when
+we take into consideration the research done by Yang et al., which
+found that 59\% of the users who use ``cloud storage service'' access
+the service through a smart phone and 42.2\% users access
 it for audio/video\cite{yang}. The research by Yang et al. definitely
 suggests a trend of users' preference for small hand-held computers
 over laptops and desktops.
@@ -76,7 +77,7 @@ minimize loss when a ``cloud'' storage provider goes out of business
 or if there is temporary outage in the storage service provided by the
 ``cloud'' storage provider.
 
-In SkyCDS the content delivery to subscribers of the content is
+In SkyCDS, the content delivery to subscribers of the content is
 segregated into two distinct layers -- Metadata Flow Layer and the
 Content Flow Layer. The publisher of the content largely interacts
 with the Metadata Flow Layer that controls and keeps track of the what
@@ -84,7 +85,7 @@ content is published and the subscriber also largely interacts with
 the Metadata Flow layer to subscribe to content published in the
 content delivery system. The Content Flow Layer is where the content
 is stored across multiple ``cloud'' storage providers. The publisher
-is responsible for publishing the content using eth ``delivery
+is responsible for publishing the content using the ``delivery
 workflow'' (part of the Content Flow Layer) and the subscriber uses
 the ``retrieve workflow'' to get access to the subscribed content.
 
@@ -98,13 +99,13 @@ workflow'' engine which is invoked when the publisher triggers the
 action to publish the respective content to subscribers.
 
 To evaluate the effectiveness of SkyCDS, Gonzalez et al. state that
-they've done a case study using the data (content) obtained from
-European Space Astronomy Center (ESAC) for the Soil Moisture Ocean
-Salinity. In this study, a group of organizations, in two different
-continents, used SkyCDS to share satillete images with each
-other. According to Gonzalez et al. this study attested SkyCDS as a
-viable option for content delivery with respective to performance,
-cost of ``cloud'' storage space and reliability.
+they've done a case study using the data obtained from European Space
+Astronomy Center (ESAC) for the Soil Moisture Ocean Salinity. In this
+study, a group of organizations, in two different continents, used
+SkyCDS to share satillete images with each other. According to
+Gonzalez et al. this study attested SkyCDS as a viable option for
+content delivery with respective to performance, cost of ``cloud''
+storage space and reliability.
 
 \section{git-annex}\label{2-gitannex-sec}
 
@@ -113,13 +114,13 @@ not usually feasible to version control under
 \verb+git+\cite{program:git}. \verb+git-annex+, checks in the names
 and other meta-data about the files in git and stores the actual
 content under \verb+.git/annex+ directory. When a file is added to
-\verb+git-annex+, a symlink of the file is created in place of th file
-and the content of the file itself is stored under the
+\verb+git-annex+, a symlink of the file is created in place of the
+file and the content of the file itself is stored under the
 \verb+.git/annex+ directory.
 
 For instance, say there is a file called
-\verb+deb-nicholson-80s.medium.webm+ was downloaded from the Internet
-to the \verb+git-annex+ directory:
+\verb+deb-nicholson-80s.medium.webm+ that was downloaded from the
+Internet to the \verb+git-annex+ directory:
 
 \begin{verbatim}
 ↳ git status
@@ -173,8 +174,8 @@ copy of a given file in another git annex repository,
 \verb+git-annex+ has this feature called
 ``special remotes''\cite{documentation:git-annex-sremotes}, that
 allows one to
-push/copy data to checked into \verb+git-annex+ to storage provided by
-``cloud'' storage providers. At the time of writing this report,
+push files checked into \verb+git-annex+ to storage provided by file
+storage providers. At the time of writing this report,
 \verb+git-annex+ supports pushing data to the following file storage
 services:
 
@@ -201,7 +202,7 @@ services:
 \end{itemize}
 }
 
-All data pushed to file storage provider's servers can be optionally
+All data pushed to file storage provider's servers can optionally be
 encrypted using one's GPG key. For instance, to encrypt data that is
 pushed to the Amazon S3 special remote, following command is
 used\cite{docs:git-annex-as3}:
diff --git a/report/combox-report.pdf b/report/combox-report.pdf
Binary files differ.