1025_gitify_your_life.ogv

  • 0:00 - 0:03
It's time to start with the next talk
  • 0:04 - 0:06
    I welcome Richard Hartmann
  • 0:06 - 0:10
He has been involved in Debian for many years
  • 0:10 - 0:14
and he recently became a Debian Developer
  • 0:14 - 0:18
    and he will talk about gitify your life.
  • 0:18 - 0:22
    ?, blogs, configs, data and backup. gitify everything
  • 0:22 - 0:24
    Richard Hartmann
  • 0:24 - 0:25
    Thank you. [applause]
  • 0:31 - 0:32
    Thank you for coming
  • 0:32 - 0:35
especially those who ? years attended all ?
  • 0:37 - 0:39
    Short thing about myself
  • 0:40 - 0:42
    As ? said I'm Richard Hartmann
  • 0:42 - 0:46
    In my day job I am backbone manager at Globalways
  • 0:46 - 0:49
    I'm involved in freenode and OFTC and...
  • 0:49 - 0:51
    should I speak louder?
  • 0:52 - 0:53
    I'm not...
  • 0:56 - 0:58
    test, test... good back there?
  • 1:01 - 1:03
    Can you turn up the volume a little bit?
  • 1:06 - 1:08
    test, test... ok, perfect.
  • 1:08 - 1:13
Since about a week ago I've been a Debian Developer (yay)
  • 1:13 - 1:21
    [applause] and I'm the author of vcsh.
  • 1:21 - 1:25
Raise of hands: who of you knows what git is?
  • 1:26 - 1:27
    perfect
  • 1:27 - 1:31
    That's just as in ? perfect, we can skip it.
  • 1:32 - 1:34
    Let's move to the first tool, etckeeper.
  • 1:34 - 1:37
Some, or maybe most, of this audience will have heard of it,
  • 1:37 - 1:46
it's a tool to basically store your /etc in pretty much every version control system you can think of
  • 1:46 - 1:48
    It's implemented in POSIX shell
  • 1:48 - 1:53
it autocommits everything in /etc basically at every opportunity
  • 1:53 - 1:55
    you may need to write excludes, for example
  • 1:55 - 1:58
    before your network config ?
  • 1:58 - 2:00
    but else, yeah, that's really cool
  • 2:00 - 2:01
    the autocommit
  • 2:01 - 2:07
    it hooks into most of the important or maybe even all of the important package management systems
  • 2:07 - 2:11
    so when you install your packages, even on SuSE or whatever
  • 2:11 - 2:14
    you can just have it commit automatically, which is very nice
  • 2:15 - 2:18
    You can obviously commit manually
  • 2:18 - 2:20
    if you for example change your X config
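For reference, a minimal sketch of that manual workflow, assuming etckeeper's default git backend (the edited file is just an example):

```sh
sudo etckeeper init                            # one-time: put /etc under version control
sudoedit /etc/X11/xorg.conf                    # change some config by hand
sudo etckeeper commit "tweak X configuration"  # record the change explicitly
cd /etc && sudo git log --oneline              # with the git backend, /etc is a plain git repository
```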
  • 2:21 - 2:23
    it supports as I said various backends
  • 2:23 - 2:26
    it's quite nice to recover from failures
  • 2:26 - 2:31
    for example ? used it to recover from saturday's power outages
  • 2:31 - 2:36
    because some servers lost stuff and with etckeeper you can just replay all the data which was...
  • 2:36 - 2:38
    rather nice.
  • 2:38 - 2:39
    Then there is bup.
  • 2:39 - 2:43
    bup is a backup tool based on the git pack file format
  • 2:43 - 2:44
    it's written in python
  • 2:44 - 2:46
    it's very very fast
  • 2:46 - 2:47
    and it's very space efficient.
  • 2:48 - 2:53
    The author of bup managed to reduce his own personal backup size
  • 2:53 - 2:56
    from 120 GiB to 45 GiB
  • 2:56 - 3:00
    just by migrating away from rsnapshot over to bup
  • 3:00 - 3:02
    which is quite good
  • 3:02 - 3:05
    I mean, it's almost or a little bit more than a third, so
  • 3:05 - 3:06
    very good
  • 3:07 - 3:10
    This happens because it has built-in deduplication
  • 3:10 - 3:14
    because obviously git pack files also have deduplication
  • 3:15 - 3:17
    You can restore every single mount point
  • 3:17 - 3:19
    or every single point in time
  • 3:19 - 3:23
every single backup can be mounted as a FUSE filesystem or a ? filesystem
  • 3:23 - 3:25
    independently of each other
  • 3:25 - 3:28
    so you can even compare different versions of what you have in your backups
  • 3:29 - 3:30
    which again is very nice
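For illustration, a minimal bup session along these lines (paths and the backup name are invented):

```sh
bup init                           # create the default repository (~/.bup)
bup index -u ~/Documents           # build or update the file index
bup save -n documents ~/Documents  # store a deduplicated snapshot named "documents"

mkdir -p /tmp/bup
bup fuse /tmp/bup                  # mount all backups as a FUSE filesystem
ls /tmp/bup/documents              # one directory per snapshot, plus "latest"
diff -r /tmp/bup/documents/latest ~/Documents   # compare a backup against the live data
fusermount -u /tmp/bup
```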
  • 3:30 - 3:35
    the one thing which is a real downside for serious deployments
  • 3:36 - 3:43
there is no way to delete data from your... archive or from your backups
  • 3:43 - 3:47
    which again is a direct consequence of using git pack files
  • 3:47 - 3:50
    there is a branch which supports deleting old data
  • 3:50 - 3:53
    but this is not in mainline and it hasn't been in mainline for...
  • 3:54 - 3:56
    I think one or two years
  • 3:56 - 3:59
    so I'm not sure if it will ever happen but...
  • 3:59 - 4:01
    yeah
  • 4:01 - 4:03
    at least in theory it would exist.
  • 4:04 - 4:10
    Then for your websites, for your wikis, for your whatever there is ikiwiki.
  • 4:11 - 4:13
    ikiwiki is a wiki compiler,
  • 4:13 - 4:14
    as the name implies,
  • 4:14 - 4:19
    and it converts various different files into HTML files
  • 4:20 - 4:21
    it's written in Perl
  • 4:21 - 4:23
    it supports various backends
  • 4:23 - 4:25
    again most of the ones you can possibly think of
  • 4:27 - 4:29
    oh, I can even slow down, good
  • 4:30 - 4:35
    it's able to parse various markup languages, more on that on the next slide
  • 4:35 - 4:42
    there are several different ways to actually edit any kind of content within ikiwiki
  • 4:43 - 4:46
    it has templating support, it has CSS support
  • 4:46 - 4:52
    these are quite extensive, but they may be improved, but that's for another time
  • 4:52 - 4:57
    it acts as a wiki, as a CMS, as a blog, as a lot of different things
  • 4:57 - 5:03
    it automatically generates RSS and Atom feeds for every single page, for every subdirectory
  • 5:03 - 5:06
    so you can easily subscribe to topical content
  • 5:06 - 5:10
    if you are for example only interested in one part of a particular page
  • 5:10 - 5:12
    just subscribe to this part by RSS
  • 5:12 - 5:15
and you don't have to check if there are updates for it
  • 5:15 - 5:20
    which is very convenient to keep track of comments somewhere or something
  • 5:20 - 5:26
It supports OpenID, which means you don't have to go through all the trouble of...
  • 5:26 - 5:29
    having a user database or doing very...
  • 5:30 - 5:31
    or doing a lot of antispam measures
  • 5:31 - 5:35
    because it turns out OpenID is relatively well...
  • 5:35 - 5:36
    suited for just...
  • 5:36 - 5:39
    stopping spam. For some reason, maybe they just
  • 5:39 - 5:41
    haven't picked it up yet, I don't know
  • 5:41 - 5:44
    but it's quite nice, because you don't have to do any actual work
  • 5:44 - 5:50
    and people can still edit your content, and you can track back changes at least to some extent
  • 5:52 - 5:58
it supports various markup languages; the best one, well, that's debatable, but in my opinion it's Markdown
  • 5:58 - 6:07
    it supports WikiText, reStructuredText, Textile and HTML and there are ikiwiki specific extensions
  • 6:07 - 6:12
for example normal wikilinks which are a lot more powerful than the normal linking style in Markdown
  • 6:12 - 6:15
    which kind of sucks, but... whatever
  • 6:17 - 6:23
    it also supports directives, which basically tell ikiwiki to do special things with the page
  • 6:23 - 6:24
    for example you can tag your blog pages
  • 6:24 - 6:27
    or you can make...
  • 6:27 - 6:33
    generate pages which automatically pull in content from different other pages and stuff like this.
  • 6:33 - 6:35
    that's all done by directives.
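As a small example of such directives (page and tag names are invented; the directive syntax is ikiwiki's own):

```sh
# tag an individual blog post
cat >> posts/debconf13.mdwn <<'EOF'
[[!tag debian talks]]
EOF

# a front page that automatically pulls in the latest posts
cat > index.mdwn <<'EOF'
[[!inline pages="posts/* and !*/Discussion" show="10"]]
EOF
```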
  • 6:38 - 6:40
    How does it work?
  • 6:40 - 6:45
    You can edit webpages directly, if you want to, on the web
  • 6:45 - 6:50
    then you will have a rebuild of the content
  • 6:50 - 6:52
    but only the parts with changes
  • 6:52 - 6:56
    so if you... hello?
  • 6:56 - 6:59
    if you change only one single file it will only rebuild one single file
  • 6:59 - 7:04
    if you change for example the navigation it will rebuild everything because obviously...
  • 7:04 - 7:06
    it is used by everything.
  • 7:16 - 7:20
    If it has to generate pages automatically, for example the index pages or something
  • 7:20 - 7:23
    if you just create a new subdirectory, or if you have...
  • 7:24 - 7:26
if you have comments which have to appear on your site
  • 7:26 - 7:29
it will automatically generate those Markdown files and commit them
  • 7:29 - 7:34
or you put them in your source directory and you just commit them and...
  • 7:34 - 7:38
    and have them part of your site, or you can autocommit them if you want.
  • 7:38 - 7:40
    That's possible as well.
  • 7:40 - 7:46
    You can obviously change... pull in changes in your local repository if you want to look at them
  • 7:47 - 7:49
    Common uses would be public wiki...
  • 7:49 - 7:53
    private notes, for just note keeping of your personal TODO list or whatever
  • 7:54 - 7:58
    having an actual blog, which a lot of people in this room probably do
  • 7:58 - 8:04
    that's, yeah, I mean a lot of people on Planet Debian have their blog on ikiwiki, for good reasons
  • 8:05 - 8:09
    and an actual CMS for company websites or stuff
  • 8:09 - 8:12
    which also tends to work quite well.
  • 8:14 - 8:22
The three main ways to interact with ikiwiki are web-based text editing, which is quite useful for new users, but is quite boring, in my opinion,
  • 8:22 - 8:28
    there is also a WYSIWYG editor which is even more fancy for non-technical users
  • 8:29 - 8:33
    and there is just plain old CLI-based editing way:
  • 8:33 - 8:39
just edit files, commit them back into the repository, push up, and everything gets rebuilt automatically, which is...
  • 8:39 - 8:42
    in my opinion the best way to interact with ikiwiki, because
  • 8:42 - 8:46
    you are able to stay on the command line and simply push out your...
  • 8:46 - 8:50
your stuff onto the web and you don't actually have to leave the command line
  • 8:51 - 8:53
    which is pretty kinda neat.
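A sketch of that command-line workflow (repository URL and page name are placeholders; the server-side hook is set up when the wiki is created):

```sh
git clone ssh://git.example.org/srv/git/wiki.git
cd wiki
"$EDITOR" posts/gitify_your_life.mdwn        # write or edit a page in Markdown
git add posts/gitify_your_life.mdwn
git commit -m "notes on gitifying your life"
git push                                     # a post-update hook on the server rebuilds the affected pages
```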
  • 8:54 - 8:57
    There are also some more advanced use cases
  • 8:59 - 9:03
    as I said you can interface with the source files directly
  • 9:03 - 9:04
    you can maintain...
  • 9:05 - 9:06
    something is wrong
  • 9:06 - 9:10
for example you can maintain your wiki and your docs and your...
  • 9:10 - 9:12
    source code in one single directory
  • 9:13 - 9:15
    and it would simply...
  • 9:15 - 9:19
    and simply have parts of your subdirectory structure rendered.
  • 9:19 - 9:21
    for example git-annex does this
  • 9:21 - 9:24
    there is a doc directory, which is rendered to the website
  • 9:25 - 9:27
    but is also part of the normal source directory
  • 9:27 - 9:31
    which means that everybody who checks out a copy of the repository
  • 9:31 - 9:34
    will have the complete forum, bug reports, TODO lists
  • 9:34 - 9:35
    user comments,
  • 9:35 - 9:40
    everything on their local filesystem, without having to leave - again - their command line,
  • 9:40 - 9:49
which means there is no media break, and so it's just very convenient to have one single resource for everything regarding one single program.
  • 9:50 - 9:53
    And another nice thing is if you create different branches
  • 9:53 - 9:59
    for preview, staging areas you can have workflows where some people are just allowed to create ...
  • 9:59 - 10:05
    pages, other people then look over those pages and merge them back into master and then push them on the website
  • 10:05 - 10:08
    which basically allows you to...
  • 10:09 - 10:14
    to have content control or real publishing workflow, if you have a need to do this
  • 10:16 - 10:18
    Next stop: git-annex.
  • 10:19 - 10:20
    The beef.
  • 10:22 - 10:29
    It's basically a tool to manage files with git without checking those files into git
  • 10:30 - 10:32
    ?
  • 10:35 - 10:36
    Yeah, what is git-annex?
  • 10:36 - 10:36
    It's based on git,
  • 10:36 - 10:39
    it maintains the metadata about files,
  • 10:39 - 10:43
    as in location, and file names and everything, in your git repository
  • 10:44 - 10:49
    but it doesn't actually maintain the file content within the git repository
  • 10:49 - 10:50
    more on that later
  • 10:50 - 10:53
    this saves a lot of time and space.
  • 10:54 - 10:59
You're still able to use any git-annex repository as a normal git repository
  • 10:59 - 11:02
    which ? means you're even able to have a mix of...
  • 11:02 - 11:05
    for example, say, all your ? files
  • 11:05 - 11:08
    should be maintained by normal git,
  • 11:08 - 11:12
    and then you have all the merging which git does for you and everything
  • 11:12 - 11:14
    and then you have for example your photographs,
  • 11:14 - 11:16
    or your videos for web publishing
  • 11:16 - 11:19
    which are maintained in the annex
  • 11:19 - 11:24
    which means you don't have to have a copy of those files in each and every single location
  • 11:26 - 11:31
    A very nice thing about git-annex is that it's written with very low bandwidth and flaky connections in mind
  • 11:32 - 11:36
    quite a lot of you will know that Joey lives basically in the middle of nowhere
  • 11:36 - 11:40
    which is a great thing to be forced to write really efficient code
  • 11:41 - 11:43
    which doesn't use a lot of data, and that shows:
  • 11:44 - 11:44
    it's really quick
  • 11:44 - 11:48
    and even if you had a really really bad connection
  • 11:48 - 11:50
    in backwaters or whatever...
  • 11:50 - 11:52
    during holidays or during normal living
  • 11:53 - 11:56
    it's still able to transfer the data which you need to transfer,
  • 11:56 - 11:58
    it's very very nice
  • 11:58 - 12:02
    There are various workflows: we'll see four of them in a few minutes
  • 12:04 - 12:09
    So. It's written in Haskell, so it's probably strongly typed and nobody can write patches for it
  • 12:11 - 12:14
    it uses rsync to actually transfer the data,
  • 12:14 - 12:17
    which means it doesn't try to reinvent any wheels
  • 12:17 - 12:24
it's really just based on top of established and well known and well debugged programs
  • 12:24 - 12:29
    In indirect mode, which in my personal opinion is the better mode,
  • 12:29 - 12:30
    what it does is
  • 12:30 - 12:36
    it moves the actual files into a different location, namely .git/annex/objects
  • 12:37 - 12:42
it then makes those files read-only, so you cannot even accidentally delete those files
  • 12:42 - 12:47
    even if you rm -f them, it will still tell you no, I can't delete them,
  • 12:47 - 12:48
    which is very secure
  • 12:49 - 12:52
may be inconvenient, but you can work around this
  • 12:52 - 12:57
    it replaces those files with symlinks of the same name, and those just point at the object
  • 12:57 - 13:00
and whether there is an object behind this symlink or not...
  • 13:00 - 13:06
basically tells you whether the content is available on this particular machine, or in this particular repository
  • 13:07 - 13:13
but you will definitely have the information about the name of the file, the theoretical location of the file...
  • 13:13 - 13:17
    the hash of the file will be in every single repository
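A small sketch of what that looks like on disk (file name and repository description are examples; the exact key format depends on the chosen backend):

```sh
git init photos && cd photos
git annex init "laptop"
cp ~/camera/IMG_0001.CR2 .
git annex add IMG_0001.CR2     # content moves under .git/annex/objects, a symlink stays behind
ls -l IMG_0001.CR2             # IMG_0001.CR2 -> .git/annex/objects/.../SHA256E-s...--<hash>.CR2
git commit -m "add raw photo"
```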
  • 13:17 - 13:19
    There is also a direct mode
  • 13:19 - 13:22
    initially mainly written for windows and Mac OS X
  • 13:22 - 13:25
    because Windows just doesn't support symlinks properly
  • 13:25 - 13:28
and OS X, while it supports symlinks,
  • 13:28 - 13:32
    apparently has lots of developers who think it is a great idea to follow symlinks...
  • 13:32 - 13:35
    and display the actual target of the symlink instead of the symlink
  • 13:35 - 13:39
    so you have cryptic filenames which are very hard to deal with
  • 13:39 - 13:46
    obviously people who are used to GUI tools which then only display really really cryptic names ?
  • 13:46 - 13:50
    so there is direct mode which doesn't do the symlink stuff
  • 13:50 - 13:53
    it basically rewrites the files on the fly
  • 13:53 - 13:58
git still thinks it is managing symlinks, but...
  • 13:58 - 14:04
git-annex just pulls them out from under git, and pushes in the actual content.
  • 14:05 - 14:09
    You keep on nodding, so... I'm probably doing good
  • 14:10 - 14:14
    and if you want you can always delete old data, or you can keep it...
  • 14:14 - 14:17
    or you can just... for example what I'm doing:
  • 14:17 - 14:20
    you can have one or two machines which slurp up all your data...
  • 14:20 - 14:26
    and have an everlasting archive of everything which you've ever put into your annexes...
  • 14:26 - 14:30
    and other machines, for example laptops with smaller SSDs
  • 14:31 - 14:34
    those just have the data which you are actually interested in at the moment
  • 14:36 - 14:38
    How does this work in the background?
  • 14:38 - 14:41
    Each repository has a UUID
  • 14:41 - 14:46
    It also has a name, which makes it easier for you to actually interact with the repository...
  • 14:46 - 14:49
    but in the background it's just the UUID for obvious reasons...
  • 14:49 - 14:55
    because it just makes ? and synchronization easy, period
  • 14:55 - 14:59
It also keeps tracking information in a special branch called git-annex
  • 14:59 - 15:03
    this branch means that all...
  • 15:06 - 15:11
this branch ? every single repository has full and complete information...
  • 15:11 - 15:16
    about all files, about the locations of all files, about the last status of those files...
  • 15:16 - 15:19
    if those files have been added to some repository
  • 15:19 - 15:19
    or they have been deleted,
  • 15:19 - 15:22
or if they have been there forever
  • 15:22 - 15:31
so in every single repository you can just look up the status of this file or of all files in all of your repositories
  • 15:31 - 15:33
    which is, yeah, convenient
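Looking that up from any clone is a one-liner (the file name and repository names are examples; the output shape is only indicative):

```sh
git annex whereis IMG_0001.CR2
# whereis IMG_0001.CR2 (2 copies)
#   xxxxxxxx-... -- laptop [here]
#   yyyyyyyy-... -- archive-disk
# ok
```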
  • 15:34 - 15:38
    The tracking information is very simple
  • 15:38 - 15:41
    and it's designed to be merged very...
  • 15:41 - 15:43
    it's a little bit more complicated than applying union merge,
  • 15:43 - 15:46
    but basically what it does is it adds a timestamp
  • 15:47 - 15:53
    and tells if the file is there or not and it has the UUID of the repository
  • 15:53 - 15:57
and from this information, along with the timestamps, you can simply reproduce...
  • 15:57 - 16:04
    the whole lifecycle of your files through your whole cloud of git-annex repositories
  • 16:04 - 16:06
    in this one particular annex.
  • 16:07 - 16:10
One really nice thing which you can do is...
  • 16:10 - 16:13
    if you are on the command line, which again in my opinion is the better mode...
  • 16:13 - 16:15
    you can simply run git-annex sync
  • 16:15 - 16:17
    which basically does a commit...
  • 16:17 - 16:20
    oh, it does a git-annex add, then it does a commit,
  • 16:20 - 16:24
    then it merges from the other repositories
  • 16:24 - 16:27
    into your own master, into your own git-annex branch
  • 16:27 - 16:29
    then it merges the log files
  • 16:29 - 16:31
    that's where the git-annex branch comes in
  • 16:31 - 16:34
    and then it pushes to all other known repositories
  • 16:34 - 16:42
which is basically a one-shot command to synchronize all the metadata about all the files with all the other repositories
  • 16:43 - 16:45
    and it takes no time at all
  • 16:45 - 16:47
    given a network connection
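In other words, roughly:

```sh
# one-shot metadata synchronisation, as described above
git annex sync    # add + commit, merge from the other repositories, push to all known remotes
```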
  • 16:48 - 16:52
    Data integrity is something which is very important for...
  • 16:52 - 16:57
    yeah, for all of the tools, but git-annex was really designed with data integrity in mind
  • 16:58 - 17:04
by default it uses SHA-256 plus the file extension...
  • 17:04 - 17:08
    to store the objects, so it renames the file to its own shasum
  • 17:08 - 17:13
    which allows you to always verify the data even without git-annex
  • 17:13 - 17:17
    you are able to say by means of globbing...
  • 17:17 - 17:22
    which files, or which directory, or which types of files should have how many copies in different repositories
  • 17:22 - 17:24
    so for example what I do:
  • 17:24 - 17:28
all my raw files, all the raw photographs are in at least three different locations,
  • 17:28 - 17:32
    all the JPEGs are only in two, because JPEGs can be regenerated
  • 17:32 - 17:33
    raws can not.
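One way to express such rules is a per-glob annex.numcopies attribute (the globs mirror the raw/JPEG split just described):

```sh
cat >> .gitattributes <<'EOF'
*.cr2 annex.numcopies=3
*.jpg annex.numcopies=2
EOF
git add .gitattributes
git commit -m "copy requirements per file type"
git annex fsck          # verifies checksums and warns when a file has too few copies
```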
  • 17:34 - 17:38
    All remotes and all special remotes can always be verified
  • 17:38 - 17:41
    with special remotes this may take quite some bandwidth
  • 17:41 - 17:46
    with actual normal git-annex remotes you run the verification locally
  • 17:46 - 17:52
and just report back the results, which obviously saves a lot of bandwidth and transfer time
  • 17:54 - 17:58
verification obviously takes the amount of required copies into account
  • 17:58 - 18:01
    so if you would have to have 3 different copies
  • 18:01 - 18:05
    and your whole repository cloud only has 2, it will complain
  • 18:05 - 18:09
    it will tell you "yes, checksum is great, but you don't have enough copies, please do something about it".
  • 18:11 - 18:15
    and even if you ? right now, delete all copies from git annex
  • 18:15 - 18:19
    you would still be able to get all your data out of git annex
  • 18:19 - 18:24
    because what it boils down to, in indirect mode, it's just symlinks to other objects
  • 18:24 - 18:28
    these objects have their own checksum as their file name
  • 18:28 - 18:31
    so you'll even be able to verify, without git-annex,
  • 18:31 - 18:33
    just by means of a little bit of shell scripting,
  • 18:33 - 18:35
    that all your files are correct,
  • 18:35 - 18:39
    that you don't have any bit flips or anything on your local disk.
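A rough version of that "little bit of shell scripting", assuming the default SHA256E backend and GNU coreutils (an illustration, not a robust tool):

```sh
#!/bin/sh
# verify one annexed file against the checksum embedded in its object name
f="IMG_0001.CR2"
obj=$(readlink -f "$f")    # .git/annex/objects/.../SHA256E-s<size>--<hash>.CR2
want=$(basename "$obj" | sed 's/^SHA256E-s[0-9]*--//; s/\..*$//')
have=$(sha256sum "$obj" | cut -d' ' -f1)
[ "$want" = "$have" ] && echo "OK: $f" || echo "CORRUPT: $f"
```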
  • 18:40 - 18:44
    direct mode doesn't really need a recovery ?, because...
  • 18:45 - 18:48
    the actual file is just in place of the symlink
  • 18:52 - 18:55
    but on the other hand you won't be...
  • 18:55 - 18:59
    you still need to look at the git-annex branch to determine the actual checksums
  • 18:59 - 19:02
    which you wouldn't have to do with the indirect mode.
  • 19:03 - 19:08
    There are a lot of special remotes. And what are special remotes?
  • 19:08 - 19:11
    these are able to store data in non git-annex remotes
  • 19:11 - 19:16
    because, let's face it, on most servers, or most servers where you could store data
  • 19:16 - 19:19
    you aren't actually able to get a shell and execute commands
  • 19:19 - 19:22
    you can just push data to it, you can receive data
  • 19:22 - 19:25
    but you cannot actually execute anything on this computer.
  • 19:27 - 19:29
    That's what special remotes are for.
  • 19:30 - 19:34
    All special remotes support encrypted data storage
  • 19:34 - 19:37
    so you just gpg encrypt your data and then send it off
  • 19:37 - 19:42
    which means that the remotes can only see the file names
  • 19:42 - 19:46
    but they cannot see anything else about the contents of your files
  • 19:46 - 19:52
    obviously you don't want to trust amazon or anyone to store your plain text data
  • 19:52 - 19:54
    that would just be stupid
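Setting up such a remote looks roughly like this (the remote name is invented; S3 credentials are read from the environment, and other remote types take different parameters):

```sh
git annex initremote cloud type=S3 encryption=shared   # content sent to this remote gets encrypted
git annex copy big-video.ogv --to cloud                # encrypt locally, then upload
```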
  • 19:54 - 19:59
    There is a hook system, which allows you to write a lot of new special remotes
  • 19:59 - 20:06
    you'll see a list of... quite an extensive list of stuff in a second
  • 20:06 - 20:11
    Normal, built-in, special remotes which are supported by haskell out of the box
  • 20:11 - 20:13
    by git-annex out of the box
  • 20:13 - 20:15
    and actually implemented in haskell
  • 20:16 - 20:22
are Amazon Glacier, Amazon S3, bup, directory — a normal directory on your system
  • 20:22 - 20:27
    rsync, webdav, http or ftp and the hook system
  • 20:28 - 20:32
there is a guy who wrote most of those
  • 20:32 - 20:37
    we can support archive.org, IMAP, box.com, Google Drive... you can read them yourself, I mean...
  • 20:37 - 20:41
    but those are quite a lot of different special remotes, if you...
  • 20:41 - 20:49
    already have storage on any of those services, just start pushing encrypted data to it if you want, and you're basically done.
  • 20:52 - 20:55
    There is an ongoing project called the git-annex assistant
  • 20:55 - 20:59
    last year, and I think this year it just ended, didn't it?
  • 21:00 - 21:05
so, pretty much exactly one year ago Joey started to raise funds
  • 21:05 - 21:12
    by means of a kickstarter to just focus on writing git-annex assistant for a few months
  • 21:13 - 21:15
    he got so much that he could do it for a whole year
  • 21:15 - 21:23
    and he's just restarted the whole thing with his own fundraising campaign without kickstarter and he got another full year
  • 21:24 - 21:33
    yeah... are you still accepting funds?
  • 21:34 - 21:38
    ok, so, if you use it at least consider donating
  • 21:38 - 21:44
because honestly you can't write patches for it anyway, because it's in Haskell, so...
  • 21:44 - 21:49
    that's... the other means of actually contributing
  • 21:53 - 21:57
the git-annex assistant boils down to being a daemon which runs in the background
  • 21:58 - 22:03
    and keeps track of all of your files, of newly added files
  • 22:03 - 22:09
    and then starts transferring those files, if configured to do so
  • 22:09 - 22:15
    it starts transferring files to other people or to other repositories
  • 22:15 - 22:18
    this is all managed by means of a web gui
  • 22:18 - 22:26
which in turn means that it's really, well, not easy, but easier to port to for example Windows or Android
  • 22:26 - 22:28
    which both work, to some extent
  • 22:29 - 22:33
    not fully, but they are useful, or useable, more or less
  • 22:34 - 22:40
    at least on android it's really quite good, I couldn't test it on windows, because...
  • 22:41 - 22:45
    and it also makes it accessible for non technical users
  • 22:45 - 22:50
    so for example if you want to share some of your photographs with your parents
  • 22:50 - 22:54
    or with friends, or if you want to share, I don't know, videos with other people
  • 22:54 - 22:57
    you just put them into one of those repositories
  • 22:57 - 23:02
    and even those non-technical people just magically see stuff appear in their own repository
  • 23:02 - 23:04
    and can just pull the data if they want to
  • 23:04 - 23:08
or if you configured it to do so, it would even transfer all the data automatically
  • 23:09 - 23:13
    which is... it's ?
  • 23:15 - 23:20
    It supports content notifications, but not content transfer
  • 23:20 - 23:22
    by means of xmpp or jabber
  • 23:22 - 23:27
    which used to work quite well with google talk, I think it's not...
  • 23:28 - 23:29
    oh, it still works, ok
  • 23:30 - 23:37
    at least at the moment, we'll see when they just ? google ? with google+, but...
  • 23:38 - 23:43
    at least at the moment it still works, if you have a google account you can simply transfer all your data
  • 23:43 - 23:49
    you can transfer the metadata about your data, you cannot actually transfer the files through jabber
  • 23:49 - 23:54
    but that's probably something which will happen within the next year
  • 23:55 - 23:58
    there are quite ? rulesets for content distribution
  • 23:58 - 24:04
    so for example I can show you...
  • 24:04 - 24:11
    you can say "put all raw files into this archive, and all jpegs on my laptop", or whatever
  • 24:11 - 24:16
    or "if I still have more than 500 GB free on this please put data in
  • 24:16 - 24:21
and as soon as I only have 20 left stop putting data into this one repository"
  • 24:21 - 24:24
    which obviously is quite convenient
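In current git-annex versions the command-line counterpart of such rules are preferred-content expressions, roughly like this (repository names and globs are invented):

```sh
git annex wanted archive-disk "include=*.cr2"
git annex wanted laptop "include=*.jpg"
git annex sync --content      # transfers then follow these rules
```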
  • 24:24 - 24:28
    as I said there is a windows port, and now on to usecases.
  • 24:28 - 24:30
    First usecase: the archivist.
  • 24:31 - 24:34
    What the archivist does is: basically he just collects data
  • 24:34 - 24:38
    either to ? or just to collect
  • 24:38 - 24:43
    and if you have this usecase what you probably want to do, you want to have offline disks
  • 24:43 - 24:47
    to store at your mom's, or to put into a drawer
  • 24:47 - 24:53
    or just you don't have enough sata ports in your computer because you just have so much data
  • 24:53 - 25:00
    so, what you can do is you can just push this data to either connected machines or to disconnected drives...
  • 25:00 - 25:02
    or to some webservice, and just store data
  • 25:02 - 25:06
    but normally you would have the problem of keeping track of where your data lives
  • 25:06 - 25:09
    if it's still ok, if it's still there, everything.
  • 25:09 - 25:16
    With git-annex you can automate all this administrative side of archiving your stuff.
  • 25:17 - 25:22
    Even if you only have one of those disks, if they're a proper remote...
  • 25:22 - 25:27
you'll have full information about all the data in your annex cloud up to this point
  • 25:27 - 25:33
so even if you only pull out one random disk you still have information on all the other disks on this one disk
  • 25:33 - 25:36
    which obviously is a nice thing.
  • 25:37 - 25:38
    Media consumption.
  • 25:38 - 25:45
    Let's say you pull a video of this talk, or you get some slides...
  • 25:45 - 25:48
    maybe also from this talk, you can get some podcasts...
  • 25:48 - 25:53
and git-annex has become a native podcatcher quite recently, I think two or three weeks ago
  • 25:53 - 25:56
which means you don't even need a separate podcatcher
  • 25:57 - 26:02
you just tell git-annex "this is all of my rss feeds" and it will just pull in all the content.
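That podcatcher mode is the importfeed command (the feed URL is a placeholder):

```sh
git annex importfeed http://example.org/podcast/feed.rss
git annex sync      # tell the other repositories about the new episodes
```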
  • 26:03 - 26:08
    Then you can synchronize all this data for example to your cellphone, or your tablet, or whatever
  • 26:08 - 26:14
    consume the data on any of your devices, even if you have 10 copies of this particular podcast
  • 26:14 - 26:17
    because you didn't get around to listen to it on your computer...
  • 26:17 - 26:20
    and you didn't get around to listen to it on your cellphone
  • 26:20 - 26:22
    but then on your tablet you did listen to it
  • 26:22 - 26:25
    you have three copies of this file which you don't need anymore...
  • 26:25 - 26:28
    because you have listened to the content and you don't care about the content anymore
  • 26:28 - 26:34
    what you do is you drop this content on one random repository
  • 26:34 - 26:38
    and this information that you have dropped the actual content,
  • 26:38 - 26:42
    not the metadata about the content, but the actual content, you don't need the content anymore...
  • 26:42 - 26:47
    will slowly propagate to all of the annexes and if they have the data they will also drop the data
  • 26:47 - 26:53
    so you don't have to really care about keeping track of those things
  • 26:53 - 26:56
    you can simply have this message propagate
  • 26:57 - 27:01
    do you want to comment? can someone give Joey a microphone?
  • 27:07 - 27:10
    Just a minor correction
  • 27:10 - 27:12
    it doesn't propagate that you've dropped the content
  • 27:12 - 27:15
    but you can move it around in ways that have exactly the effect you described
  • 27:16 - 27:22
    ? get the wrong idea that if you accidentally remove one thing it will vanish from everything ?
  • 27:23 - 27:26
    but if you deliberately drop the content and tell the annex...
  • 27:26 - 27:28
    no. that's not how it works.
  • 27:28 - 27:30
    I want to talk about it later, but it's...
  • 27:30 - 27:32
    you looked at the slides, but...
  • 27:32 - 27:33
    sorry, ?
  • 27:35 - 27:37
    He watches for everything which is ?
  • 27:47 - 27:55
    Next thing, if you are on the road, and one usecase which is probably quite common: taking pictures while you are on the road ?
  • 27:55 - 27:58
    You take your pictures, you save them to your annex
  • 27:58 - 28:01
    where you are able to store them back to your server or wherever
  • 28:01 - 28:07
    if you want to, and even if for example one disk gets ?
  • 28:07 - 28:09
    and you lose part of your content,
  • 28:09 - 28:14
    you'll still at least be able to have an overview of what content used to be in your annex
  • 28:14 - 28:21
    and if you then pull out your old SD card and see "oh, that photo is still there" you can simply reimport it and it will magically reappear.
  • 28:21 - 28:22
    What it also does is:
  • 28:22 - 28:24
    if you have a very tiny computer with you
  • 28:24 - 28:29
    you can, as soon as you are at an internet cafe, just sync up with your server or your storage, whatever
  • 28:29 - 28:34
    and push out the data to your remotes
  • 28:34 - 28:39
    which then means you'll have two or three or five copies of the data
  • 28:39 - 28:41
    and git-annex keeps track of what is where for you
  • 28:41 - 28:45
    so you don't have to worry about copying stuff around.
  • 28:48 - 28:51
    And then there is one personal usecase, for photographs
  • 28:52 - 28:56
    I have a very specific way of organizing my photographs
  • 28:56 - 28:58
    my wife disagrees violently
  • 29:00 - 29:03
    she likes to do her photo storage in a completely different way
  • 29:03 - 29:05
    she doesn't care about the raw files
  • 29:05 - 29:12
    and she doesn't care about all the documentation pictures of signposts or whatever which I just took to remember which cities we went through
  • 29:12 - 29:19
    so what she can do is she can simply delete the actual files or ? the symlink of this file
  • 29:19 - 29:22
    and it will disappear from her own annex
  • 29:22 - 29:24
    she can then commit all this
  • 29:24 - 29:30
    normally if she would sync back the data I would also have the same layout, which I don't want
  • 29:30 - 29:34
especially since she tends to rename everything a lot
  • 29:34 - 29:39
    but what I did, I set up a rebasing branch on top of my normal git-annex repository
  • 29:39 - 29:43
    so what she gets is: she has her own view of the whole data
  • 29:43 - 29:45
    or the part she cares about
  • 29:45 - 29:47
    and when I add new content
  • 29:47 - 29:51
    she will see the new content, she will rearrange the content however she pleases
  • 29:51 - 29:53
    but as it's a rebasing branch
  • 29:53 - 29:56
    all her changes will always be replayed on top of master
  • 29:59 - 30:02
    so she has her own view, and I don't even notice her own view
  • 30:02 - 30:08
    but even if she uses one of the other computers she would have the same view which she herself has
  • 30:08 - 30:12
so basically she has her own view of all of the data
  • 30:12 - 30:15
    This is very convenient to keep the peace at home.
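A minimal sketch of such a rebasing view branch (branch and path names are invented, and merge conflicts are ignored here):

```sh
git checkout -b curated master
git rm 2013/raw/IMG_0001.CR2      # drop the symlink from this view; the content itself is untouched
git commit -m "hide raw files from this view"

# later, once master has gained new photos:
git checkout curated
git rebase master                 # replay the curation commits on top of the new content
```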
  • 30:17 - 30:19
    Next topic: vcsh.
  • 30:20 - 30:23
    Most of you here probably have some sort of system...
  • 30:23 - 30:27
where you have one Subversion or CVS or whatever repository
  • 30:27 - 30:30
    and they have it somewhere in their home directory
  • 30:30 - 30:36
    you symlink into various places in your home directory, and it kind of keeps working so you don't throw it away, but...
  • 30:36 - 30:39
    to be honest it sucks. Here is why.
  • 30:41 - 30:43
    Or, here's why in a second.
  • 30:44 - 30:47
vcsh is implemented in POSIX shell, which is very very portable
  • 30:47 - 30:52
    it's based on git, but it's not directly git
  • 30:52 - 30:57
The one thing which git is not able to do is maintain several different working copies in one directory
  • 30:57 - 31:00
    which is a safety feature, more on that later
  • 31:00 - 31:06
    but this really sucks if you want to maintain your mplayer, your shell, your whatever configuration
  • 31:06 - 31:11
    in your home directory, which is the obvious and only real place where it makes sense to put your configuration
  • 31:11 - 31:14
you don't want to put it into some .dotfiles directory and then symlink back
  • 31:14 - 31:18
    you want to have it in your home directory as actual files.
  • 31:18 - 31:21
    So, vcsh uses fake bare git repositories
  • 31:21 - 31:23
    again, more on that on the next slide
  • 31:23 - 31:25
    and it's basically a wrapper around git
  • 31:25 - 31:31
    which makes git do stuff which it normally wouldn't do
  • 31:31 - 31:36
    and it has a quite extensible and useful hook system which ? will care about
  • 31:37 - 31:42
With a normal git repository you have two really defining variables within git
  • 31:42 - 31:44
    you have the work tree
  • 31:44 - 31:46
    which is where your actual files live
  • 31:47 - 31:51
    and you have the $GIT_DIR, where the actual data lives
  • 31:51 - 31:56
normally in a normal checkout you just have your directory and .git under this
  • 31:57 - 32:02
    If you have a bare repository you obviously don't have an actual checkout of your data
  • 32:02 - 32:06
    you have just all the objects and the configuration stuff
  • 32:06 - 32:09
    so that's what a bare repository boils down to being
  • 32:10 - 32:13
    A fake bare git repository on the other hand has both
  • 32:13 - 32:15
    it has a $GIT_WORK_TREE and it has a $GIT_DIR
  • 32:15 - 32:17
    but those are detached from each other
  • 32:17 - 32:20
    they don't have to be closely tied together
  • 32:20 - 32:26
    and also sets core.bare = false, to actually tell git that "yes, this is a real setup, but..."
  • 32:26 - 32:31
    "yes, you still have a work tree, even thought you don't really expect it"
  • 32:31 - 32:33
    "to have one, you still have a work tree".
  • 32:35 - 32:38
    By default vcsh puts your work tree into home
  • 32:38 - 32:40
    and your git dir into...
  • 32:40 - 32:45
    it's based on .config/vcsh/repo.d and then the name of the repository
  • 32:45 - 32:50
which just puts it away and out of the way of you actually seeing stuff
  • 32:50 - 32:55
    but it follows the cross desktop specifications so if you move stuff around it will also follow
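Conceptually, each vcsh repository boils down to an environment like this before git is called (paths follow the defaults just described; the repository name is an example, and doing this by hand carries the risks discussed next):

```sh
export GIT_DIR="$HOME/.config/vcsh/repo.d/zsh.git"
export GIT_WORK_TREE="$HOME"
git config core.bare false
git status                      # git now treats $HOME as the work tree of this repository
unset GIT_DIR GIT_WORK_TREE     # leaving these set is exactly the danger described below
```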
  • 32:55 - 32:57
    Fake bare repositories are really...
  • 32:58 - 33:02
are messy to set up, and it's very easy to get them wrong
  • 33:02 - 33:07
    that is also the reason why git normally disallows this kind of stuff
  • 33:08 - 33:10
    because all of a sudden you have a lot of...
  • 33:10 - 33:13
    context-dependency on when you do what
  • 33:13 - 33:15
just imagine you set git workdir...
  • 33:15 - 33:16
$GIT_WORK_TREE, sorry
  • 33:16 - 33:20
    and run random commands like git add, that's...
  • 33:20 - 33:26
kind of ok, if you git reset --hard you'll probably not be too happy
  • 33:26 - 33:29
if you check out the current version, that's also quite bad
  • 33:29 - 33:32
and if you git clean -f, yeah, you just throw away your home directory
  • 33:32 - 33:34
    congratulations
  • 33:34 - 33:39
    So, it's really risky to run with these variables set
  • 33:39 - 33:44
    which is why I wrote vcsh to wrap around git
  • 33:44 - 33:50
    to hide all this complexity and do quite some sanity checks to make sure everything's set up correctly
  • 33:50 - 33:57
    again it allows you to have several repositories and it also manages really the complete lifecycle of all your repositories
  • 33:57 - 34:03
it's very easy to just create a new repository, you just init, just as with git
  • 34:03 - 34:08
    you add stuff, you commit it, and you define a remote and start pushing to this remote
  • 34:09 - 34:10
    simple
  • 34:11 - 34:14
This looks like git because it's very closely tied to git
  • 34:14 - 34:19
    and it uses a lot of the power or of the syntax of git, for obvious reasons
  • 34:19 - 34:22
because... it's closely tied to git
  • 34:22 - 34:25
    you can simply clone as you would with git
  • 34:25 - 34:28
    you can simply show your files as you would with git
  • 34:28 - 34:32
    you can rename the repository, which git can't do, but you don't have to
  • 34:32 - 34:34
    you can show the status of all your files
  • 34:34 - 34:36
    or just of one of your repositories
  • 34:36 - 34:38
    or of all repositories
  • 34:38 - 34:44
    you can pull in all your repositories at once, you can push all of your repositories at once
  • 34:44 - 34:46
    with one single command
  • 34:47 - 34:52
    so, if you are on the road, or you just want to sync up a new machine it's really quick, it's really easy
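A typical lifecycle might look like this (repository name and remote URL are placeholders):

```sh
vcsh init zsh
vcsh zsh add ~/.zshrc
vcsh zsh commit -m "initial zsh config"
vcsh zsh remote add origin git@example.org:dotfiles-zsh.git
vcsh zsh push -u origin master

# on another machine
vcsh clone git@example.org:dotfiles-zsh.git zsh
```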
  • 34:53 - 34:57
    There are three modes of dealing with your repositories
  • 34:57 - 34:59
    default mode is the quickest to type
  • 34:59 - 35:04
    you just say vcsh zsh commit whatever or any random git command
  • 35:05 - 35:06
    but you cannot really run gitk
  • 35:06 - 35:10
    you can do this by using the run mode, which is the second mode
  • 35:10 - 35:14
    we simply ? here run is missing and here git is missing
  • 35:14 - 35:19
    so you say simply vcsh run zsh git commit whatever
  • 35:19 - 35:26
and this is exactly the same command, it's literally the same command once it arrives at the shell level, so to speak
  • 35:26 - 35:29
    here you can also run gitk, because...
  • 35:29 - 35:34
    with this, you set up the whole environment for one single command to run with this context
  • 35:34 - 35:37
    of the changed environment variables
  • 35:37 - 35:42
    or you could even enter the repository, then you set all the variables
  • 35:42 - 35:46
    and then you can just use normal git commands as you would normally
  • 35:46 - 35:48
    this is the most powerful mode,
  • 35:48 - 35:52
    but it's also the most likely to hurt you if you don't know what you're doing
  • 35:52 - 35:55
    so I don't recommend working ? down this way.
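The three styles side by side (the repository name is an example):

```sh
vcsh zsh log --oneline      # default mode: "run ... git" is implied
vcsh run zsh gitk --all     # run mode: any command, executed with the repository's environment
vcsh enter zsh              # enter mode: a subshell with GIT_DIR/GIT_WORK_TREE set
git status                  #   plain git now operates on this repository
exit                        #   leave again before doing anything destructive
```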
  • 35:57 - 36:04
    You should have your shell display prompt information about being in a vcsh repository or not
  • 36:04 - 36:08
simply because otherwise you may forget that you entered something
  • 36:08 - 36:14
    and then if you run those commands, there will be pain
  • 36:18 - 36:22
Advanced use cases, which will be possible quite soon:
  • 36:22 - 36:29
    we can just combine vcsh with git-annex to manage everything which is not configuration files in your own home directory
  • 36:29 - 36:35
    ? basically two programs to sync everything about all of your home directory
  • 36:35 - 36:37
    without having to do any extra work
  • 36:38 - 36:41
you can also use it to do really weird stuff
  • 36:41 - 36:46
for example you can back up the .git of a different repository with the help of vcsh
  • 36:46 - 36:52
    so you can just go in, change objects or anything, break stuff and just replay whatever you're doing
  • 36:52 - 36:56
    just to try and see how it breaks in interesting ways.
  • 36:56 - 37:02
You can just back up a working copy which is maintained by a different repository or a different system
  • 37:02 - 37:07
    you can even put a whole repository, including the .git,
  • 37:07 - 37:08
into a different git repository
  • 37:08 - 37:13
    or you can even put other VCSs like subversion or something into git, if you want to.
  • 37:14 - 37:16
    Then there is mr.
  • 37:16 - 37:18
    mr ties all those...
  • 37:18 - 37:23
    hopefully by now you have about twenty new repositories
  • 37:23 - 37:26
    because you have configuration, you have ikiwiki, you have everything
  • 37:26 - 37:29
so now you need something to synchronize all those repositories
  • 37:29 - 37:32
    because doing it by hand is just a lot of work
  • 37:35 - 37:41
    mr supports push, pull, commit operations for all the major known version control systems
  • 37:41 - 37:45
    allowing you to have one single interface to operate on all your systems
  • 37:45 - 37:49
    It's quite trivial to write support for new systems
  • 37:49 - 37:52
    I think it took me about two hours to support vcsh natively
  • 37:52 - 37:54
    so, that's really quick
  • 37:54 - 37:57
    If you want to try, the stuff which I told you about...
  • 37:57 - 38:05
    in the links later there will be the possibility to just clone a subrepository for vcsh
  • 38:05 - 38:10
which will then set up a suggested mr directory layout
  • 38:10 - 38:12
    and you can just work from there
  • 38:12 - 38:16
    This is the... alright, it's my suggested layout
  • 38:16 - 38:18
    which basically...
  • 38:18 - 38:22
    you just include everything in config.d you maintain...
  • 38:22 - 38:30
    your available.d, by means of vcsh, so you simply sync around all your content between all the different computers
  • 38:30 - 38:35
    and then you simply soft link from available to the actual config
  • 38:35 - 38:39
which is basically what Apache does with sites-enabled and sites-available
  • 38:39 - 38:43
    or modules.available and modules.enabled
  • 38:43 - 38:45
    which is really really powerful
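On disk that suggested layout looks roughly like this (paths follow the vcsh/mr template; the repository name is an example):

```sh
ls ~/.config/mr/available.d      # one mr snippet per repository, itself synced around via vcsh
ls ~/.config/mr/config.d         # symlinks to the snippets enabled on this machine
ln -s ../available.d/zsh.vcsh ~/.config/mr/config.d/zsh.vcsh

mr update     # clone or pull every enabled repository
mr push       # push them all with one command
mr status     # see which ones still have uncommitted changes
```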
  • 38:45 - 38:48
    Last thing is not git based, but zsh.
  • 38:49 - 38:52
    It's a really powerful shell, you should consider using it
  • 38:52 - 38:56
it has very good tab completion for all the tools listed here, more than bash
  • 38:56 - 39:00
    it has a right prompt, which will automatically disappear if it needs to
  • 39:00 - 39:05
    which is very convenient to display not important but still useful information
  • 39:05 - 39:11
    and it will automatically, if you tell it to, tell you about you being in a git repository or subversion repository or whatever
  • 39:11 - 39:12
by means of vcs_info
  • 39:13 - 39:18
    which also means you'll be told that at the moment you are in a vcsh repository
  • 39:18 - 39:21
    and you may kill your stuff if you do things wrong
  • 39:21 - 39:23
    it can mimic all the major shells
  • 39:23 - 39:26
    and there's just too many reasons to list
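A minimal right-prompt setup along those lines (an assumed snippet, not from the talk):

```zsh
autoload -Uz vcs_info
precmd() { vcs_info }               # refresh VCS information before every prompt
zstyle ':vcs_info:*' enable git svn
setopt prompt_subst
RPROMPT='${vcs_info_msg_0_}'        # right-hand prompt, hidden when the line needs the space
```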
  • 39:27 - 39:29
    So... final pitch
  • 39:29 - 39:34
    This is true: I've tried it earlier, I can demo it, I still have five minutes left
  • 39:34 - 39:39
it takes me less than five minutes to synchronize my complete, whole, digital life while on the road
  • 39:39 - 39:44
so if I'm at the airport and just want to update all my stuff, and push out all my stuff...
  • 39:44 - 39:47
    it'll take a few minutes, but then I can hop on the airplane...
  • 39:47 - 39:51
    and I'll know everything is fine, everything is up-to-date on my local machine
  • 39:51 - 39:57
    on my laptop machine, I can continue working, and have a backup on my remote systems
  • 39:57 - 39:59
    These are the websites
  • 40:00 - 40:08
    The slides will be linked from penta, so you are more than welcome to look at these links later
  • 40:08 - 40:12
    There are previous talks, which you can also look at, if you want to
  • 40:12 - 40:14
    and that's pretty much it
  • 40:14 - 40:17
    and if you have any more questions afterwards either catch me...
  • 40:17 - 40:21
    or there is an IRC channel, and there is a mailing list
  • 40:21 - 40:27
    ok, we can take a few questions, we have still a few minutes
  • 40:27 - 40:31
    then if there are more questions ask Ritchie afterwards
  • 40:32 - 40:36
    And while we are doing this just look here, because that's a complete sync of everything I have
  • 40:37 - 40:40
    Just to make sure I understood this correctly,
  • 40:40 - 40:49
    with git-annex the point is that the data is stored dispersed over different local destinations, so to speak
  • 40:49 - 40:53
    but the metadata ? exists, ? complete git history
  • 40:53 - 41:00
    so git is able to tell me, "well, this version at that destination was changed at that time and so on and so on"
  • 41:00 - 41:03
    did I get this right or...
  • 41:03 - 41:05
    git will be able to tell you about changes...
  • 41:05 - 41:08
    ok, I don't have internet, sorry
  • 41:08 - 41:12
    git will be able to tell you about changes in the filenames, or directory structure
  • 41:12 - 41:16
    git-annex will be able to tell you about changes in the actual file content
  • 41:16 - 41:17
    or in moving around the files
  • 41:17 - 41:22
    but as one single unit, more or less, then yes...
  • 41:22 - 41:25
    the answer is yes, but not quite, but yes
  • 41:25 - 41:32
    yes, but ? all the things you asked about are in git, you know the previous location, all that stuff
  • 41:32 - 41:39
    but in a separate branch which you should use git-annex to access, but you can do it by hand if you want to
  • 41:52 - 41:55
    I'm not familiar with tracking branches,
  • 41:55 - 41:58
    yet you mention that the workflow for your wife has a different view of the data than you
  • 41:58 - 42:07
    with that workflow is it possible for your wife to upload photos that you will have in your view as well, or is it a oneway street?
  • 42:07 - 42:13
    minor correction: tracking branches track a different repository,
  • 42:13 - 42:18
    what I meant was rebasing branches, which rebase on top of a different branch
  • 42:18 - 42:24
    which basically just keeps the patches always on top of the branch, no matter where the head moves to
  • 42:27 - 42:32
    if she wanted to do that she would need to simply git checkout master
  • 42:32 - 42:39
    do whatever she wanted to do, and then git checkout her own branch, and then she's...
  • 42:39 - 42:44
    she is able to, but she would need to change into the master branch and then back
  • 42:49 - 42:50
    microphone
  • 42:51 - 42:57
    she never pushes her private branch? it always lives on her own machine?
  • 42:57 - 43:02
    no, she does push it, but I don't display this view of the data
  • 43:03 - 43:08
because otherwise she wouldn't be able to synchronize this view between different computers
  • 43:08 - 43:12
    I seem to have internet now, so let's just let this run in the background
  • 43:14 - 43:15
    any more questions?
  • 43:23 - 43:25
    no more questions?
  • 43:27 - 43:27
then we...
  • 43:27 - 43:29
    ? more minutes for questions?
  • 43:36 - 43:41
    ok, so thanks to Richard Hartmann, we will continue...