Saturday, November 26, 2016

KeePass Ultimate Setup and Security Guide

1 Introduction

Passwords are our gateway to interacting with the digital world. It's how we show that it's really us because no one else could know our password, right? Passwords are not perfect or very convenient to use but it's the only thing we have now. Better options are being researched, one of them could be the U2F token but for now we're stuck with passwords.

I heard people don't follow the best practices for safe passwords. And who's to blame? We are supposed to have strong passwords containing all kinds of crazy characters and different for each site. And everybody is using at least 10 sites on a regular basis plus around 100 other random sites they already forgot about. Humans can simply never remember 10 or more strong passwords and if they can, it's probably because they've been participating in memorizing competitions.

Let the computer remember things for you and you can forget all your passwords except one. Using a password manager (in this article I'm introducing KeePass 2), you can save all your passwords securely encrypted with a single master password. This master password will be long but you'll be able to remember it easily because you'll use it every day and it's the only one you need.

In this article I'll introduce KeePass 2, the open source password manager, together with a security analysis so that you have concrete arguments explaining why it's secure. The first part of each section explains how to use the password manager securely and is required reading. The second part explains how the security works and is optional.

1 1 Security analysis

It's necessary to use a different password on each site in case one of them gets breached (it has happened: LinkedIn, Yahoo, ...). A hacker who needs your password for a more important website will first try to compromise the other services you are using, and a reused password hands them the important one too.
What if somebody compromises my computer and steals my unlocked password vault? That could happen but in that case they'll also have access to all your private files and even if you didn't use a password manager, access to websites you're already logged in to. Keeping your devices free of malware is always necessary.
"I still don't feel good about centralizing all my passwords in one place", you say. That is generally a sound security attitude but consider that your primary email account already centralizes access to most of your services because it's used for forgotten password reset.
For critical sites (such as email), it's best to also use 2 Factor Authentication.

2 Getting started

2 1 Download

The original KeePass 2 application is Windows only. It can be downloaded from this page http://keepass.info/download.html. Choose the Installer button on top right and wait a moment for the download to start.

Alternatively, download from https://www.fosshub.com/KeePass.html, choose "KeePass Installer, Professional Edition" (it's a strange name choice. Don't download the classic edition).

For a Mac, download KeePassX from https://www.keepassx.org/downloads and install in the usual Mac fashion.

2 2 Installation

When starting the installation on Windows, it should show a security window asking if you really want to install this program. This window MUST show Open Source Developer, Dominik Reichl. If not, do not allow it and delete the downloaded installer as you got a bad copy.

Security Analysis:

The project homepage as well as SourceForge mirrors don't have HTTPS. That's a bummer but the application files are digitally signed by the developer and the certificate is recognised by Windows. Therefore checking the digital signature provides stronger security than HTTPS. Furthermore, the FOSShub link is served over HTTPS.
The homepage for KeePassX does use HTTPS as well as the download. It does not have digital signatures but it can be downloaded from a website owned by the project's author and not a third-party (as is the case with sourceforge).

2 3 Choosing the master password

After you install the program, you can create a new database. Now is the time to create your master password.

Setting the password

This will be the main password that unlocks your database. It must be strong, stronger than your Facebook or banking password. It must be a new password, not something you were using before on a website. You must remember it well (try to type it a few times and then again the next day).

Your master encryption password needs to be really good. It should be at least 12 characters long but a better way is to pick a dictionary book and randomly pick 5 or 6 totally unrelated words. Maybe you can even combine multiple languages! "pasta blip port Bled nehmen" sounds good.
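
To get a rough feel for how strong such a passphrase is, here is a small C++ sketch (it is not part of KeePass) that estimates its entropy. It assumes every word is picked uniformly at random and independently from a word list. The function name and the 7776-word list size (the size of a typical Diceware list) are my own illustrative choices.

```cpp
#include <cmath>

// Estimated entropy in bits of a passphrase made of `word_count` words,
// each picked uniformly at random from a list of `dictionary_size` words.
// Assumption: truly random, independent picks. Words you "choose" in your
// head are far less random than this estimate suggests.
double passphrase_entropy_bits(double dictionary_size, int word_count) {
    return word_count * std::log2(dictionary_size);
}
```

For example, passphrase_entropy_bits(7776, 5) comes out at about 64.6 bits, and every additional word adds roughly 12.9 bits.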

Setting "encryption difficulty"

After creating your database, you may want to go to File / Database Settings and then the Security tab. Here, click the "1 second delay" link to set the number of key transformation rounds. This is basically something like "encryption difficulty" and it increases the time taken to unlock the vault. A 1 - 5 sec delay is sufficient if you have a good password.

Don't forget to save your password vault file!

Security Analysis:

The problem with encryption passwords is that a potential attacker, after stealing your encrypted database, can just keep trying all possible words until they crack it. Actually they'll program a computer to do it while they are having a beer. The computer can try a lot of passwords per second.
Because of the danger of cracking the passwords, encryption tools also include a delay to slow it down. You can configure it in KeePass. The bigger delay and the better the password, the safer you are.
It's a good idea to increase the "encryption difficulty" 5 years later because computers will be faster in the future.
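
The interplay between the delay and the password strength can be put into numbers. The sketch below is a back-of-the-envelope estimate and not anything KeePass computes: the function name and parameters are hypothetical, and it assumes the attacker's machines are exactly as fast as yours, which is optimistic.

```cpp
#include <cmath>

// Rough average time, in years, to brute-force a password with
// `entropy_bits` bits of entropy when every guess costs `seconds_per_guess`
// (the key transformation delay) and the attacker runs `parallel_guessers`
// machines, each as fast as yours. On average half of the search space
// has to be tried before the right password is found.
double average_crack_years(double entropy_bits, double seconds_per_guess,
                           double parallel_guessers) {
    const double seconds_per_year = 365.25 * 24 * 3600;
    const double average_guesses = std::pow(2.0, entropy_bits - 1);
    return average_guesses * seconds_per_guess / parallel_guessers /
           seconds_per_year;
}
```

With a 40-bit password and a 1 second delay, a single machine needs on the order of 17,000 years on average, but a thousand machines bring that down to about 17 years. Each extra bit of password entropy doubles the time, so a better password helps far more than a longer delay.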

2 4 Settings

These settings are subjective and also depend on who can have access to your machine. This is what I would recommend for normal use. In Tools / Options:

Enable "Lock workspace after global user inactivity" and set it to 360 s or less.

Enable "Clipboard auto-clear time".

Enable "Lock workspace when computer is about to be suspended".

On the Interface tab, I like to enable "Drop to background after copying data to the clipboard".

2 5 Settings for KeePassX

This program is slightly different from the original Windows KeePass 2. Transform rounds ("encryption difficulty") can be set in Database / Database Settings. Again, you can click the Benchmark button to configure it to a recommended value.

Enable automatic locking in KeePassX / Preferences, on the Security tab.

2 6 Plugins

There are many plugins created by the community for KeePass. Currently I'm using none of them. Be careful because plugins can break security of KeePass and even their authors may not realize that. For example a browser integration plugin increases the risk quite a bit.

3 Day to day usage

Besides the security and cryptography, KeePass is a pretty ordinary program from a user perspective. Click Edit / Add Entry ... to add a new password entry. The program will automatically generate a new strong password for you so you only need to enter the site name and address (used by browser integration). Then click OK and File / Save to save the database.

To use a stored password, you have two options. The first one is to copy to clipboard (simply Ctrl+C) and paste in the website. The second option, which is slightly more convenient and slightly more secure is to use Auto-Type. Switch to your browser and place the cursor in the login form, in the user name field. Then switch to KeePass and select Perform Auto Type on the password entry. It will automatically log you in!

You can also create groups and assign icons to your entries but I think it's best to simply search for a site when you need it using the search box on the toolbar.

You can also use KeePass to safely store any other pieces of information such as a bank PIN. It's not very suitable for storing files though. For that you may need to look at your OS' disk encryption or VeraCrypt.

4 Syncing the database

It's 2016, you probably have more than one computing device. Maybe you have too many of them. And you need to access your password database on all of them. This is where KeePass lags behind the commercial password vaults because you'll need to set it up by yourself. But don't worry, you can just use Dropbox or Google Drive ... or OneDrive or SpiderOak or any other file sync service you may already be using. Just put your password database in there and you're done.

Sounds insecure? Well the database is encrypted so if your password is good, your data is safe. Still feeling uncomfortable about it? You can add another factor - a keyfile. KeePass allows you to generate a file that is required to decrypt the database. You will then manually (using a USB stick) copy this file to any computer you want to use the password database on. Do not put it in Dropbox! Without the keyfile (and your password) there's no way in hell anyone could crack your encrypted database.

4 1 Step by step

Dropbox and Microsoft OneDrive will automatically sync any file you put in their special folder. Other similar services will probably do the same but I haven't used them.

First, add a keyfile to your password vault. If you already have created one, open it in KeePass and choose File / Change Master Key. In the dialog box here, enable both Master password and Key File. Type your master password again (don't need to change it). Then click Create to create a keyfile. Do not put this keyfile in your Dropbox. After finishing this, you can save the password vault to your Dropbox and it will be synchronized to your other computers using Dropbox.

Now you need to transfer the keyfile to your other computers. The best way to do this is offline, without using the internet. Copy the keyfile onto a USB stick and use the stick to carry it to each computer. Again, do not place the keyfile in the Dropbox folder. You should consider locking this USB stick away safely to keep it as a backup of your keyfile. If not, don't forget to delete the keyfile off the USB stick before using it for something else.

Now you can use your password and the keyfile to open your password vault. The vault will be synchronized by Dropbox.

4 2 Security Analysis:

If even a bit worried, use a keyfile.
If you lose your keyfile (or your master password), you won't be able to open the password database, ever. So write both on a paper and keep it at home, in a safe or something.
I'd prefer using a file sync service that supports file versions such as Dropbox or Google Drive. MS OneDrive can't.
Really, no one can break the encryption (AES algorithm). And if the NSA can, it'll cost millions of $$. Hacking your computer will be cheaper so that's what you should focus on next.
A practical way to delete the keyfile from a USB stick is to completely fill up the USB stick with other data (such as large movie files). Unfortunately it may not guarantee all traces of it disappear since flash chips may over-provision to make up for faulty portions. So the most secure way is to not use a USB stick but rather copy the file manually (it's just text and not that long).

5 Other password managers

Before KeePass I was using LastPass. Together with 1Password, these seem to be the most established password managers at this time. Let me share some thoughts about how they compare. Note that the security analysis here focuses on the worst case scenarios and can sound a bit scary.

In terms of price and development model, KeePass is free and open source, LastPass is commercial but free for basic use and 1Password is fully paid. It's easier for security people to check the security of open-source software.

LastPass works as a browser plugin, same with 1Password. That's more risky from a security point of view. For one, malicious websites might find some way to steal a password. KeePass is simple and isolated from the browser. Also, if a commercial password manager company changes management, gets sold or becomes subverted by a government, it could publish an update of its browser plugin that steals your data. That's a risk with all software that you use, including Windows or OS X. Again, KeePass is a slightly smaller risk in this respect if you carefully check each update that you install.

For ease of use, the commercial programs may be more convenient. They take care of synchronization for you and 1Password is beloved for its user interface.

5 1 New password managers

While it's great that people try to innovate in the security area, I'd always be wary of new password managers until it's proven that their developers know what they're doing. Security is not easy and a new product made by people without proper knowledge and experience can be a risk, even when the developers have good intentions.

6 Who am I to write about this?

I've been a software developer (a computer guy) longer than I can remember and in the past few years I've been focusing on cryptography engineering and security, studying and implementing cryptographic things at work. I found a crypto problem with a browser extension for KeePass. So I know enough to realize that I don't actually know enough yet! Also, I'm a level 45 crypto wizard ;)

Have I personally audited KeePass? Nope. But it's trusted by internet people and honestly, there's not that much to screw up since it's a rather simple program. I hope to take a look one day.

Friday, October 21, 2016

Crafting reliable C++

Everybody hates bugs [citation needed]. Why spend time hunting down bugs when you could be shipping code and earning money? This is especially true of low-level memory management bugs in C/C++. If you’re a newbie, words like memory leak, use-after-free or double-delete may not have an effect on you but any experienced developer will recoil at these words in horror or start chanting “asan, asan, asan, …” So, OK, we don’t want these bugs in our code, let them go somewhere else. This is especially true in my field of crypto engineering where bugs in crypto code or protocols can lead to complete failure of the projects. And bugs in ordinary programs can cause security problems just as well.
The weapons I choose for this battle are Modern C++, automated testing, static analysis from the compiler, dynamic analysis tools, fuzzers and patience. Some people don’t like C++ or code C++ like it was plain old C. Sure, those Linux kernel programmers have a quantum computer in their head that lets them simulate all possible program paths at the same time to see where a lock or memory is not released. But I’m just a human and can barely keep track of the socks in my drawer so I have to find another way, one that is more fool-proof and doesn’t require so many socks.
So here are my top n tools and techniques to make my C++ less buggy, crashing less often and more reliable. It’s totally not a complete list but rather an introduction, a base level of tooling that everybody should know about but somehow not always does.

1 Deleting objects

C++ doesn’t have a garbage collector so we have to clean up our garbage manually. If we don’t, the program will eventually drown in garbage and run out of memory (but not before trashing the hard drive by swapping). The old and deprecated option is to do a delete obj; manually. This is completely unreliable (because it’s manual). You may forget to do this when having multiple exits from a method. Or when a method throws an exception. Or when another method you call throws an exception. Not to mention returning objects that the caller then has to free.
The modern approach is to use RAII which is a weird name for a simple concept: use the destructor to do all cleanup at the end of scope. If we have a scope with an object on the stack like this:

{
        VictorTheCleaner v;
        ...
}

then the destructor of v will be called when the scope is exited and it can take care of any deleting, releasing and cleaning that’s necessary. A scope can be exited in several ways:

normal program flow reaches the } brace
a statement such as return, break or continue causes program flow to jump out
an exception is thrown

These are cases where you would have to manually do a delete and where the destructor will do it for you. The most common usage is a smart pointer that “owns” a non-smart (stupid) pointer and is responsible for deleting it. Another usage can be making sure you close a database connection. Or close a file. Or a network socket. Or change the process current directory to what it was before. This is equivalent to the using clause in languages such as C# and Python. C++ doesn’t have a dedicated keyword, instead we use the destructor.
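
As a small illustration of the "change it back to what it was before" case, here is a sketch of an RAII helper. ValueRestorer, g_verbosity and noisy_operation are names I made up for this example. The destructor puts the old value back whether the scope ends by a normal return or by an exception.

```cpp
#include <stdexcept>

// Hypothetical RAII helper: remembers a variable's value on construction
// and puts it back when the scope is left, however that happens.
template <typename T>
class ValueRestorer {
public:
    explicit ValueRestorer(T &target) : m_target(target), m_saved(target) {}
    ~ValueRestorer() { m_target = m_saved; }

    // Copying would restore the value twice, so forbid it.
    ValueRestorer(const ValueRestorer &) = delete;
    ValueRestorer &operator=(const ValueRestorer &) = delete;

private:
    T &m_target;
    T m_saved;
};

// Example: temporarily raise a (made-up) verbosity setting.
int g_verbosity = 1;

void noisy_operation(bool fail) {
    ValueRestorer<int> restore(g_verbosity);
    g_verbosity = 5;
    if (fail) throw std::runtime_error("something went wrong");
    // Normal return: the destructor of `restore` runs here as well.
}
```

After noisy_operation returns or throws, g_verbosity is back at 1 without a single explicit cleanup statement in the function body.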

1 2 unique_ptr

The standard library template class std::unique_ptr<T> is designed to take care of the most common case - when you need to automatically delete an object. Use std::unique_ptr<T>. Use it for function local variables, use it as object members (to be safe if your constructor throws). Use it especially for C land objects that are created with functions like EC_POINT_new() and must be deallocated with EC_POINT_free(). This is how you can set a user defined function as the deleter:

template<typename T, void (*Fn)(T*)>
class function_deleter {
public:
    // Calls the C-style free function Fn on the owned pointer.
    void operator()(T *p) const {
        if (p != NULL) Fn(p);
    }
};

template<typename T, void (*Fn)(T*)>
class unique_ptr_ex {
public:
    typedef std::unique_ptr<T, function_deleter<T, Fn>> type;

    // do not instantiate this class, use unique_ptr_ex<T, Fn>::type
    unique_ptr_ex() = delete;
};

The first class defines a functor (something that has operator()). This is necessary because the second template parameter of unique_ptr is a type. The purpose of the second class is to act as a typedef with a parameter. It’s a workaround because not all compilers I’m using support the new template typedef in C++11. These two classes simplify creating unique_ptr with different cleanup functions:

unique_ptr_ex<BIGNUM, BN_free>::type m_privkey;
unique_ptr_ex<EC_POINT, EC_POINT_free>::type m_ec;

Then just initialize m_privkey with a new BIGNUM that belongs to you and deallocation using the openssl-provided function BN_free is taken care of automagically. Same for m_ec!
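
On compilers that do support C++11 alias templates, the wrapper class can be replaced by a using declaration. A sketch, where Widget, widget_new and widget_free are made-up stand-ins for a C library API (they are not OpenSSL functions) and c_unique_ptr is a name I picked:

```cpp
#include <memory>

// Made-up C-style API standing in for something like BN_new()/BN_free().
struct Widget { int value; };
int g_live_widgets = 0;
Widget *widget_new() { ++g_live_widgets; return new Widget{0}; }
void widget_free(Widget *w) { --g_live_widgets; delete w; }

// Deleter type that calls a fixed free function. unique_ptr never invokes
// its deleter on a null pointer, so no null check is needed here.
template <typename T, void (*Fn)(T *)>
struct function_deleter {
    void operator()(T *p) const { Fn(p); }
};

// Alias template: the C++11 replacement for the unique_ptr_ex<T, Fn>::type trick.
template <typename T, void (*Fn)(T *)>
using c_unique_ptr = std::unique_ptr<T, function_deleter<T, Fn>>;

void use_widget() {
    c_unique_ptr<Widget, widget_free> w(widget_new());
    w->value = 42;
}   // widget_free() is called on the widget here
```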
For details how to use unique_ptr, see the reference. I’ll try to give a few basic guides here.

Use it for local variables that you created in the function and need to clean in the same place.
Use it for member variables of a class that “owns” these variables. Owning them means that the class is responsible for cleaning them which typically happens when the class itself is deleted.
You can also use them as return types for functions that create an object and pass its ownership (responsibility for deleting) to the calling function. No more uncertainty on who should do the cleaning, unique_ptr tells you you are the responsible owner and will do it automatically for you too.
Do not use it for object pointers that do not transfer ownership. If I call a function that uses an existing object, I use a naked pointer or a reference (if it can’t be null).
Do not use it for member variables of a class if the class is not owning that object.

The last case is a little tricky. In some cases, more than one class are responsible for a cleanup job and that’s where a smarter smart pointer such as shared_ptr would come in. I try to keep things simple and have only one owner for each object so that I can use unique_ptr.
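
The ownership rules above could look like this in code. It's only a sketch: Connection, open_connection, host_length and Session are hypothetical names for illustration.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

struct Connection {
    std::string host;
};

// Factory: the return type says the caller now owns the object.
std::unique_ptr<Connection> open_connection(const std::string &host) {
    return std::unique_ptr<Connection>(new Connection{host});
}

// Non-owning use: a plain reference, because the object must exist
// and this function is not responsible for deleting it.
std::size_t host_length(const Connection &conn) { return conn.host.size(); }

// Owning member: Session deletes the connection when it is destroyed.
class Session {
public:
    explicit Session(std::unique_ptr<Connection> conn)
        : m_conn(std::move(conn)) {}
    const Connection &connection() const { return *m_conn; }

private:
    std::unique_ptr<Connection> m_conn;
};
```

Passing the pointer into Session requires an explicit std::move, so the hand-over of ownership is visible at the call site.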

1 3 Other cleanup

Sometimes you need to do more things besides deleting an object upon scope exit. Maybe you need to close a database connection, restore the original value or roll back an object to a previous state.
You could create a new class with the appropriate code in destructor for each of these cases (for example PostgreCloser, PwdRestorer, …) but that is a little inconvenient. That’s why there is ScopeGuard, a class where you can redefine the cleanup code in-place.
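
To show the idea, here is a deliberately minimal scope guard. This is my own simplified sketch and not the implementation from any particular library. SimpleScopeGuard and bump are made-up names, and the cleanup code must not throw because it runs inside a destructor.

```cpp
#include <functional>
#include <utility>

// Minimal scope guard sketch: runs `cleanup` on scope exit unless dismissed.
class SimpleScopeGuard {
public:
    explicit SimpleScopeGuard(std::function<void()> cleanup)
        : m_cleanup(std::move(cleanup)), m_active(true) {}
    ~SimpleScopeGuard() {
        if (m_active) m_cleanup();
    }

    // Call when everything succeeded and the cleanup is no longer wanted.
    void dismiss() { m_active = false; }

    SimpleScopeGuard(const SimpleScopeGuard &) = delete;
    SimpleScopeGuard &operator=(const SimpleScopeGuard &) = delete;

private:
    std::function<void()> m_cleanup;
    bool m_active;
};

// Example: the increment is rolled back unless `commit` is true.
void bump(int &counter, bool commit) {
    ++counter;
    SimpleScopeGuard rollback([&counter] { --counter; });
    if (commit) rollback.dismiss();
}
```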

2 Program invariants

A short intermission is necessary to explain the term invariant. The term literally means “something that doesn’t change” and in programming that will be a condition (a logical statement) that mustn’t change and be always true while the program is running because the code relies on it and things would break otherwise. There can be invariants on particular lines in the code (at the start of a function, in a loop) or they are related to a data structure or an OOP class. Invariants are usually not expressed in the programming language itself, it’s just something that we keep in mind and use when thinking about how the code will behave. Or at least should keep in mind. Ideally.
For example a data structure invariant for binary search tree is that each node has at most 2 children. A more interesting invariant requires that nodes under the left child are all less than the current node and all under the right child are more than the current node. If this condition doesn’t hold then efficient search in binary search tree will be broken.
A string splitting function will have an invariant at the end of the function (called a post-condition) that says that the two returned parts actually form the original string if reassembled. A splitting function would not otherwise be very useful. A memory allocator will have an invariant stating that all active allocations are kept track of so that a further allocation cannot occur in a piece of memory already given to someone else.
Invariants help us describe program behaviour and requirements using logic. At this time, mainstream programming languages don’t have any support for working with them. But there are languages that focus on correctness in the academic research community and they let you write those invariants in the code for formal checking.
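
Even without language support, an invariant can be written down as an ordinary checking function and called from assertions or tests. A sketch for the binary search tree ordering invariant, where Node and is_search_tree are names chosen for this example:

```cpp
struct Node {
    int key;
    Node *left;
    Node *right;
};

// Checks the ordering invariant: every key in the left subtree is smaller
// and every key in the right subtree is larger than the node's key.
// `low` and `high` are exclusive bounds; nullptr means "no bound".
bool is_search_tree(const Node *n, const int *low = nullptr,
                    const int *high = nullptr) {
    if (n == nullptr) return true;
    if (low != nullptr && n->key <= *low) return false;
    if (high != nullptr && n->key >= *high) return false;
    return is_search_tree(n->left, low, &n->key) &&
           is_search_tree(n->right, &n->key, high);
}
```

A debug build could then call assert(is_search_tree(root)) after every insert or delete to catch a broken tree at the moment it breaks.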

3 Error handling

I’m coding a Modern C++ interface around some library from C land and using exceptions to signify any errors because I don’t want to deal with manual propagation of error codes up the stack and I always want to know when an error happens (see https://www.securecoding.cert.org/confluence/display/cplusplus/ERR02-CPP.+Avoid+in-band+error+indicators). Manual error code propagation clutters the code, makes it harder to grasp and is prone to human errors. Modern C++ / OOP programming encourages proper object initialization in the constructor (as opposed to an additional Init() method) and exceptions are the only way to report errors from there (also note this Boost article).
Now some people don’t like exceptions in C++ but I found that most (not all) of their arguments are based on limitations in the old versions of the language or simply ignorance of how exactly the language works. Be sure to get familiar with recent development in the C++ language, most significantly the RAII pattern. Of course exceptions have some drawbacks too and some properties that need to be kept in mind (or your quantum computer brain):

Throwing and catching exceptions is s……l..o…w. If you expect this to happen often, for example in parsing code, you need error codes (or Expected, see below); see comparison.
Throwing an exception in a destructor is very destructive. It will probably crash your program (see here for an example).
Throwing an exception in the constructor means the object construction was cancelled and destructor won’t be called. That makes sense but also could mean that a delete m_obj; in the destructor may never be invoked (again, example here) even though you have already new’ed it in constructor. That is one more reason to use unique_ptr for member variables since these variables will be protected against a sudden death by exception from constructor.
Only one exception can be in flight for a thread at a moment. This means that the exception model cannot support async callback based programming in the style of Node.js or Python Twisted and you need to store exceptions manually (mentioned in this talk).
Exceptions can appear anywhere and there’s nothing in the code that will warn you about them.
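
The constructor case from the list above is easy to get wrong, so here is a sketch of it. Buffer, LeakyOwner and SafeOwner are made-up names; the live counter exists only to make the leak observable.

```cpp
#include <memory>
#include <stdexcept>

// Counts live objects so that a leak becomes visible.
struct Buffer {
    static int live;
    Buffer() { ++live; }
    ~Buffer() { --live; }
};
int Buffer::live = 0;

// Leaks: ~LeakyOwner() never runs because the constructor did not finish,
// so the delete in the destructor is never reached.
class LeakyOwner {
public:
    LeakyOwner() : m_buf(new Buffer()) {
        throw std::runtime_error("init failed");
    }
    ~LeakyOwner() { delete m_buf; }

private:
    Buffer *m_buf;
};

// Safe: members that were already constructed are destroyed when the
// constructor throws, and the unique_ptr member deletes the Buffer.
class SafeOwner {
public:
    SafeOwner() : m_buf(new Buffer()) {
        throw std::runtime_error("init failed");
    }

private:
    std::unique_ptr<Buffer> m_buf;
};
```

Constructing a SafeOwner and catching the exception leaves Buffer::live at 0, while doing the same with LeakyOwner leaves one Buffer behind.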

If exceptions are not an option, you’ll be returning error codes. I’d suggest a slightly more sophisticated approach. The Expected type can be returned by functions that are supposed to produce a result but could also fail. It has type parameters for the result value as well as for the error. Then (for example) a parsing library could use the following interface:

struct ParseError {
    int line, col;
    std::string expected;
};

Expected<int, ParseError> parseInt(std::string);
Expected<int, ParseError> parseHex(std::string);
Expected<Url, ParseError> parseUrl(std::string);

The Expected class also seems to support the error monad programming style but that would be for another day :)
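
To make the interface above more concrete, here is a heavily simplified stand-in for an Expected type together with a toy parseInt. This is my own sketch and not the API of any real Expected implementation: the success/failure/ok names are assumptions, it stores both members at once (so both types must be default-constructible), and the parser ignores signs and overflow.

```cpp
#include <cstddef>
#include <string>
#include <utility>

struct ParseError {
    int line, col;
    std::string expected;
};

// Simplified Expected: holds a value or an error plus a flag saying which.
// Real implementations store only one of the two, in a union.
template <typename T, typename E>
class Expected {
public:
    static Expected success(T value) {
        Expected e;
        e.m_ok = true;
        e.m_value = std::move(value);
        return e;
    }
    static Expected failure(E error) {
        Expected e;
        e.m_error = std::move(error);
        return e;
    }
    bool ok() const { return m_ok; }
    const T &value() const { return m_value; }
    const E &error() const { return m_error; }

private:
    Expected() : m_ok(false), m_value(), m_error() {}
    bool m_ok;
    T m_value;
    E m_error;
};

// Toy parser: accepts decimal digits only, does not check for overflow.
Expected<int, ParseError> parseInt(const std::string &text) {
    typedef Expected<int, ParseError> Result;
    if (text.empty()) return Result::failure(ParseError{1, 1, "digit"});
    int result = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        const char c = text[i];
        if (c < '0' || c > '9')
            return Result::failure(
                ParseError{1, static_cast<int>(i) + 1, "digit"});
        result = result * 10 + (c - '0');
    }
    return Result::success(result);
}
```

The caller has to look at ok() before using the value, which makes the failure path visible at the call site without the cost of throwing.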

3 1 Error mis-handling

Beginners tend to ignore errors, I still remember I was doing it. I could barely manage to write the code to do what I wanted in the first place. But if we’re talking about reliable software, ignoring errors is unacceptable. Quite the opposite, we need to know each and every error that happened, is happening or may happen.
Errors that have happened need to be logged using an appropriate logging framework and in some cases may even be stored in a database for further analysis or sent to a remote monitoring server. See more in Exception Driven Development.
Errors that are happening right now need to be detected. If calling an external library that doesn’t throw exceptions but returns an error code, check the returned code and throw, log or handle (retry?) the problem! From this point of view, exceptions (as opposed to returned error codes) really help because they are not ignored by default so the risk of forgetting to report an error is lower. But then again some newbies will very carefully put a catch block around each function so that they can ignore the valuable exception object that bears details about the problem.
How about errors that may happen in the future? Try to anticipate possible problems in the future but don’t try to recover, auto-repair or anything like that. Instead, check invariants and assumptions about your data structures consistency using assertions that report problems immediately. For example, if you have a class that is not thread-safe and you designated it to be only accessed from a single thread, assert it (this is a trick I found in Chrome source code):

void Gui::UpdateBlinkies() {
    assert(GetCurrentThread() == MainThread);

    m_blinkie ++;
}
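
GetCurrentThread() and MainThread in the snippet above are placeholders. A portable sketch of the same trick using only the standard library could look like this. The ThreadChecker class and its method name are my own choices for this example; it treats the thread that created the object as the only one allowed to use it.

```cpp
#include <cassert>
#include <thread>

// Remembers the thread it was created on.
class ThreadChecker {
public:
    ThreadChecker() : m_owner(std::this_thread::get_id()) {}
    bool called_on_owner_thread() const {
        return std::this_thread::get_id() == m_owner;
    }

private:
    std::thread::id m_owner;
};

class Gui {
public:
    void UpdateBlinkies() {
        // Fires in debug builds if another thread sneaks in.
        assert(m_thread_checker.called_on_owner_thread());
        m_blinkie++;
    }
    int blinkie() const { return m_blinkie; }

private:
    ThreadChecker m_thread_checker;
    int m_blinkie = 0;
};
```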

The point is, again, to discover any problems, inconsistencies and unexpected situations as soon as possible because you are better able to debug and fix the problem. If the program crashes 10 minutes after the problem started, how can you trace the crash 10 minutes back to its original cause?
Having some hard-to-debug problem that a client reports but you can never see on your machines? Then you should have a log file that provides additional information. If even that doesn’t help, rather than spending a week trying to reproduce the problem on your machine, you can ship a debug build to the customer that collects information that you need or one that has enabled asserts. If you have used them well, that alone may be able to pinpoint the bug.

3 2 Exception safety

But no matter if you choose exceptions or error codes to handle those unusual unhappy cases, you still need to be careful about exception (or error) safety. This means that you need to
1) release any resources that were acquired before an error
2) return the application into a consistent state (invariant safety)
Everybody should be pretty familiar with point #1 where doing some new BigObject() must be always followed by a delete even if you get an exception in between.
Point #2 is similar except that it is specific to your application invariants.
For example, if you are keeping some data in two structures and always need to update both of them on inserting, you need to make sure both things happen (or get rolled back) even if an exception is thrown:

void insert_both(string a, string b) {
    m_by_name.insert(a, b);
    // OMG, what if an exception happens here?
    DoSomethingElse();
    m_by_addr.insert(b, a);
}

Handling this case could be still easy, just add a catch, remove the item and exit:

void insert_both(string a, string b) {
    m_by_name.insert(a, b);
    try {
        DoSomethingElse();
        m_by_addr.insert(b, a);
    } catch (...) {
        m_by_name.remove(a, b);
        throw;
    }
}

But if you need to do something like this twice in a function, it starts to get complicated. Fortunately, C++ provides an elegant and convenient way to take care of both requirements. Memory safety has already been described (remember unique_ptr). Invariant safety can be done with a ScopeGuard which is a more flexible alternative to unique_ptr. There’s an implementation in the Facebook’s folly library. For the above example with two maps, you could use it in the following way:

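void insert_both(string a, string b) {
    m_by_name.insert(a, b);
    // Undo the first insert if anything below throws.
    ScopeGuard insert_guard = makeGuard([&] { m_by_name.remove(a, b); });

    DoSomethingElse();
    m_by_addr.insert(b, a);
    // Both inserts succeeded, so cancel the rollback.
    insert_guard.dismiss();
}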

If any exception occurs in the code, the insert_guard will execute the remove operation to restore the original state. If everything goes smoothly to the end of the function, the scope guard will be cancelled by the dismiss() call.
This way you can have nice linear code which is easy to understand even if there is more than one rollback. Just imagine the scope guard as “do this cleanup if anything goes wrong down there” as opposed to having nested try-catch clauses with many possible combinations of control flow.

3 2 1 Testing exceptions

When we test our code, we usually focus mostly on the ‘happy path’ where everything goes as planned and the edge cases receive less attention. But if we want to have truly reliable code, even those error or edge cases deserve some attention. If you use code coverage tools, they will keep flashing their red warnings in the exception handlers at you until you add them to ignore list (err, I mean, fix them).
Proper unit testing (covered later) of course requires also testing the error and edge cases. You should try to come up with possible incorrect inputs (or problematic program state) and know, for each of them, how the program should handle it. And write this down in a unit test. In this way both the “happy” and failure behaviours of the function are well documented and verified to be correct.
This approach for unit-level testing is well established. On the more coarse scale, there is another technique where we artificially throw exceptions and check that they are handled appropriately. We can throw exceptions at various places in the program and check general properties such as whether it causes memory leaks, memory corruption, or crashes. This is only relevant in languages such as C++ which have those memory issues by default.
To automate this, you would put instrumentation points at interesting places in your program. Then you run your program or test suite over and over, triggering these instrumentation points in sequence. If each run of your program is deterministic (it takes the same path each time), you will have triggered each of the N points in the end, after running the test suite N times.
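
The "run it N times" loop described above can be sketched in a few lines. Injector and run_with_injection are hypothetical names used only here, and the sketch assumes the scenario is deterministic so that point number k is the same place in every run.

```cpp
#include <functional>
#include <stdexcept>

// Hypothetical injector: throws at the instrumentation point whose
// zero-based index within the current run equals the chosen target.
class Injector {
public:
    void start_run(int target) {
        m_target = target;
        m_counter = 0;
        m_fired = false;
    }
    // Place a call to this at every interesting point in the program.
    void point() {
        if (m_counter++ == m_target) {
            m_fired = true;
            throw std::runtime_error("injected exception");
        }
    }
    bool fired() const { return m_fired; }

private:
    int m_target = -1;
    int m_counter = 0;
    bool m_fired = false;
};

// Runs `scenario` again and again, injecting at point 0, then 1, then 2...
// Stops at the first run in which nothing was injected, which means every
// reachable point has had its turn. Returns the number of points covered.
int run_with_injection(Injector &injector,
                       const std::function<void()> &scenario) {
    for (int target = 0;; ++target) {
        injector.start_run(target);
        try {
            scenario();
        } catch (const std::exception &) {
            // The exception escaped the scenario. A real harness would
            // check here that it was logged and that nothing leaked.
        }
        if (!injector.fired()) return target;
    }
}
```

The whole loop would normally run under a memory checker so that leaks or corruption caused by any single injected exception show up in its report.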
Since this is a sort of mass exception injection approach, we cannot test for specific behaviour of specific cases, only for overall response to exceptions. Memory correctness will be the most typical case. Another one could be ensuring that all those exceptions are properly logged. This method is very useful if you’re creating a binding to another programming language such as Java or Python or even plain old C. Typically you need to catch exceptions in the C++ world and translate them somehow into exceptions or at least error codes in the target language without messing up the memory or exception safety.
You also need to run this under a memory checker such as Valgrind, Asan or PageHeap which will inform you if any memory leak or access violation occurred. If all goes smoothly, you’ll know that exceptions can’t mess with you. It also probably means that you used RAII and unique_ptr correctly because without them it’s hard to make memory management right in the face of exceptions.
This approach has also been described in Exception-Safety in Generic Components.
This is how you may implement it:

#include <iostream>
#include <sstream>
#include <stdexcept>
#include <string>

// Once placed in code, it can be redefined to do a different type
// of instrumentation such as heap consistency checking.
// NOTE: most likely, this should be disabled in release builds
#define INSTRUMENTATION_POINT { ExceptionInstrument::instrument(__FILE__, __LINE__); }

class ExceptionInstrument {
public:
    ExceptionInstrument();
    static ExceptionInstrument &instance();
    void dispose();

    bool should_throw();
    void maybe_throw(const std::string &file, int line);
    static void instrument(const std::string &file, int line);

    void next_run();
    void set_run(int no) { m_run_id = no; }

private:
    // singleton
    static ExceptionInstrument *g_instance;

    std::string get_filename_base(const std::string &path) const;

    int m_run_id;     // index of the point that throws during the current run
    int m_counter;    // points passed so far during the current run
    int m_threw_cnt;  // exceptions thrown during the current run
};

ExceptionInstrument *ExceptionInstrument::g_instance = NULL;

ExceptionInstrument::ExceptionInstrument()
    : m_run_id(0), m_counter(0), m_threw_cnt(0) {}

void ExceptionInstrument::dispose() {
    // deletes the singleton (possibly *this), so don't touch members afterwards
    delete g_instance;
    g_instance = NULL;
}

void ExceptionInstrument::maybe_throw(const std::string &file, int line) {
    if (!should_throw()) return;

    std::stringstream ss;
    ss << "Instrumentation exception F " << get_filename_base(file) << " L " << line;
    throw std::runtime_error(ss.str());
}

void ExceptionInstrument::instrument(const std::string &file, int line) {
    instance().maybe_throw(file, line);
}

ExceptionInstrument &ExceptionInstrument::instance() {
    // NOT THREAD SAFE
    if (g_instance == NULL) {
        g_instance = new ExceptionInstrument();
    }
    return *g_instance;
}

std::string ExceptionInstrument::get_filename_base(const std::string &path) const {
    // handle both Windows and Unix path separators
    std::string::size_type pos = path.find_last_of("\\/");
    if (pos == std::string::npos) {
        return path;
    }
    // skip the separator itself
    return path.substr(pos + 1);
}

bool ExceptionInstrument::should_throw() {
    // throw exactly once per run, at the point whose index equals the run id
    bool res = false;
    if (m_counter == m_run_id) {
        res = true;
        m_threw_cnt += 1;
    }
    m_counter += 1;
    return res;
}

void ExceptionInstrument::next_run() {
    // this is called from Java, do nothing if instrumentation is disabled
#ifdef ENABLE_INSTRUMENTATION_THROW
    if (m_threw_cnt == 0) {
        std::cerr << "Instrumentation run " << m_run_id << " threw no exceptions." << std::endl;
    }
#endif
    m_counter = 0;
    m_threw_cnt = 0;
    m_run_id += 1;
}
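
The listing above leaves out the loop that reruns the program. Here is a minimal, self-contained sketch of how such a driver might look. The Injector class and run_all_injections are hypothetical, stripped-down stand-ins written for this illustration (not the class above), and the sketch assumes the workload is deterministic, as discussed earlier.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

// Hypothetical, simplified injector: in run number N it throws at the
// N-th instrumentation point it passes (counting from zero).
class Injector {
public:
    void point() {
        if (counter_++ == run_id_) {
            threw_ = true;
            throw std::runtime_error("injected");
        }
    }

    // Prepares the next run. Returns false once the previous run threw
    // nothing, i.e. every reachable point has already been exercised.
    bool next_run() {
        bool keep_going = threw_;
        counter_ = 0;
        threw_ = false;
        ++run_id_;
        return keep_going;
    }

private:
    int run_id_ = 0;
    int counter_ = 0;
    bool threw_ = false;
};

// Runs the workload over and over, one injected exception per run.
// Returns how many runs ended with an injected exception.
int run_all_injections(Injector &inj, const std::function<void()> &workload) {
    int injected = 0;
    do {
        try {
            workload();
        } catch (const std::runtime_error &) {
            ++injected;  // a real driver would check logs or leak reports here
        }
    } while (inj.next_run());
    return injected;
}
```

With a workload that passes three instrumentation points, the loop runs four times: three runs that each fail at a different point and one final clean run that ends the loop. Run the whole thing under a memory checker to see whether any of the failing runs leaked or corrupted memory.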

Another technique is using what I call explosive mocks. They work as ordinary mocks but throw a random exception when called. They may help you test correct handling of exceptions that you didn’t expect when first writing the code. For example, when connecting to a network API, all kinds of things can go wrong: network and DNS problems, authentication, API changes, invalid parameters, … It’s not a very systematic method but it can be useful as exploratory testing to find bugs.
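
To make the idea concrete, here is a small sketch of an explosive mock. Everything in it (HttpClient, ExplosiveHttpClient, fetch_username) is made up for illustration. It also cycles through the exception types in a fixed order instead of picking one at random, which is my own simplification to keep failures reproducible.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical interface that the code under test depends on.
struct HttpClient {
    virtual ~HttpClient() = default;
    virtual std::string get(const std::string &url) = 0;
};

// Explosive mock: every call throws, cycling through several exception types.
struct ExplosiveHttpClient : HttpClient {
    int calls = 0;

    std::string get(const std::string &) override {
        int kind = calls++ % 3;
        if (kind == 0) throw std::runtime_error("connection reset");
        if (kind == 1) throw std::invalid_argument("bad parameter");
        throw std::out_of_range("unexpected response size");
    }
};

// Code under test: it must never let an exception escape and should
// fall back to a placeholder instead.
std::string fetch_username(HttpClient &client) {
    try {
        return client.get("/user/me");
    } catch (const std::exception &) {
        return "<unknown>";
    }
}
```

A test then calls fetch_username a few times with the explosive mock and checks that it returns the placeholder every time, whatever exception type was thrown.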
QUESTION: how to handle exceptions in message loops / GUI / … in a way that is debuggable, readable, testable?

4 Automated testing

I use automated unit tests and (more or less) automated integration tests where it makes sense. This is a big and complicated topic and I have an article with a few of my observations in progress. Make sure to keep your interpipes to this blog clean so you don’t miss it ;)

5 Tooling for correctness

In no way complete or sufficient but this is what I use.

5 1 Address Sanitizer (+ more)

If you write in C++ (or, god forbid, C), you’re going to have memory bugs. Well, unless you are using Address Sanitizer, or Asan for short. This kind of bug can cause your program to crash, which is a little annoying to see. But in some cases a crash can lead to a Remote Code Execution exploit. RCE is basically when a snake whisperer (hacker) convinces your program to start executing some code the hacker offered. So yeah, that’s a little less convenient, especially when they use it to steal your money or data.
Asan helps you catch these bugs, and memory leaks too. It can’t detect every problem with your code: it only detects problems in code that you actually execute, so you still need a good test suite. In the code that does run, it catches nearly every access violation (segfault for unix folks) and prints beautiful coloured output in the console. And since everybody loves coloured console output (and not having segfaults), I hereby endorse using Asan for everything.
Originally developed for the clang compiler, it’s now available for gcc as well (sorry for people stuck with gcc 2.x; I wonder if that gcc version is written on papyrus?). Detailed usage is here but in short, you will need to create a new build variant for your project that will generate an Asan-instrumented build with the -fsanitize=address flag. This build will be around 2x slower and will abort immediately when an access violation is detected. It does not report false positives, so that abort will be something you’ll have to fix.
You may have used Valgrind. For memory access violations, Asan is similar but works better. It does require you to recompile the code with Asan enabled but then it’s much more accurate.
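
As a small illustration of the kind of bug this is about, consider the sketch below. The helper nth is hypothetical and deliberately has no bounds check. The code as written stays inside the buffer; the buggy call is only described in a comment.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper without a bounds check: returns the n-th element (1-based).
// Precondition: 1 <= n <= v.size(). Violating it reads outside the heap buffer.
int nth(const std::vector<int> &v, std::size_t n) {
    return v.data()[n - 1];
}

// Correct use: the access stays inside the buffer, so Asan stays silent.
int demo_ok() {
    std::vector<int> v = {10, 20, 30};
    return nth(v, 3);
}

// Buggy use (intentionally not called here): nth(v, 4) reads one int past
// the end of the vector's heap buffer. A normal build will most likely
// return garbage without complaining; a build made with -fsanitize=address
// aborts with a heap-buffer-overflow report pointing at the faulty read.
```

Adding -g to the Asan build variant gets you file names and line numbers in the report.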

Asan output

5 2 afl-fuzz

So awesome as it is, Asan won’t catch problems in obscure code branches that don’t get executed. One thing you can be sure of: hackers will try to find them and run them so that they can pwn your machine and steal your candy. Here come fuzzers, tools that are designed specifically to execute code paths that normally never see the light of day. They do it by running your code, like, a million times, each time with a slightly different input and observing whether it caused some different behaviour in your code.
This is mostly suitable for programs that read and parse some input files such as images, videos, PDFs or even antivirus software reading .exe files.

See, it has colours

afl-fuzz is one such program, free, open source and pretty good. It’s not complete magic though. You need to adjust your project to do nothing but read the input data and sometimes you need to help the fuzzing process a bit with a hint. afl-fuzz works by inserting instrumentation during compilation that informs the fuzzer where the control flow is going. Based on this instrumentation, it tries to alter the input data to find new control flow paths. And when it finds a control flow branch that crashes your code, it’ll happily return the bad input. Programmers will take that sample input to go and fix the bug, hackers will take that sample to develop an exploit.
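
To give an idea of what "do nothing but read the input data" means, here is a sketch of a tiny harness. The record format and the names parse_record and fuzz_one are invented for this example; a real project would call its own parser instead.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical parser under test: the first byte is the payload length,
// followed by that many payload bytes. Returns false on malformed input.
bool parse_record(const std::string &data, std::string &payload) {
    if (data.empty()) return false;
    std::size_t len = static_cast<unsigned char>(data[0]);
    if (data.size() - 1 < len) return false;  // the bounds check a fuzzer will probe
    payload.assign(data, 1, len);
    return true;
}

// Harness entry point: read the whole input and feed it to the parser.
// The parse result is ignored because crashes are what the fuzzer looks for.
int fuzz_one(std::istream &in) {
    std::string data((std::istreambuf_iterator<char>(in)),
                     std::istreambuf_iterator<char>());
    std::string payload;
    parse_record(data, payload);
    return 0;
}
```

The actual harness would add a main() that simply returns fuzz_one(std::cin), be compiled with the afl compiler wrapper (for example afl-g++ harness.cpp -o harness) and then started with something like afl-fuzz -i testcases -o findings ./harness, where testcases and findings are placeholder directory names.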

5 3 Catch

As far as automated testing frameworks for C++ are concerned, there’s quite a choice. You’ve got the one from Google, Boost, even Visual Studio comes with one. I can’t compare them but I tried Catch and enjoyed it very much. So besides the #1 requirement of coloured console output, it has the benefit of being very light and easy to include in the project (just one header file!) and very easy to use.
To assert, you would simply write

REQUIRE( factorial(2) == 2 )

and it will automatically deconstruct it into two sides of the == operator, showing expected and actual if it doesn’t match. Testing this way is much more natural than the classic Assert.AreEqual(factorial(2), 2) or even Assert.That(factorial(2)).Equals(2) or whatever the latest fad in fluent interfaces is.
The BDD style for organizing tests has been the most convenient from what I’ve seen so far. You can have a hierarchy of test conditions, delimited using SCENARIO, GIVEN, WHEN, THEN and at each step of the hierarchy, you can set up some objects that will be used in levels below. The test framework will then take this tree and run each path independently. Let me give an example with a completely imaginary API:

SCENARIO("web service test", "[web][http]") {
  WebServiceFake fake;
  RestClient client(fake.url());

  GIVEN("authenticated client") {
    client.user("pete");
    client.password("abcd");

    WHEN("makes request about self") {
      Request r(client.new_request("/user/pete"));
      r.get();

      THEN("gets its data") {
        REQUIRE(r.json().get("salary") == 123);
      }
    }
  }
  GIVEN("guest client") {
    WHEN("makes request about pete") {
      Request r(client.new_request("/user/pete"));
      r.get();

      THEN("gets nothing") {
        REQUIRE(r.json().count() == 0);
      }
      THEN("error is reported") {
        REQUIRE(r.status() == 403);
      }
    }
  }
}

In this example, both “authenticated client” and “guest client” will be run separately, with a fresh instance of client object each time. In all but the most basic unit tests, we have to deal with setting stuff up and this layered structure is really helpful because it helps avoid duplication while putting the code where it’s easy to see.

5 4 PageHeap

While I believe there are plans to make Address Sanitizer available for Windows, at the time of writing that port was not yet ready. PageHeap is a debugging tool built into Windows that can be used to detect buffer overflow errors. It’s not as versatile as Asan but it also helped save my code’s neck a few times (it was particularly useful for catching a bug at the boundary of C# and C++ code). It doesn’t require you to recompile the code, you just enable it for a particular program using gflags.exe available with the Windows SDK. It works by putting each allocation at the end of a virtual memory page, directly followed by an inaccessible page, which allows the OS to catch any access over the page boundary.

5 5 Other tools

WinDbg is a very powerful debugger with a bunch of scripts and extensions available. For source code based debugging, Visual C++ is usually sufficient because you can see everything. But WinDbg sure comes in handy when you don’t have the code and need to debug issues outside your own code or have problems calling closed source or system libraries. On second thought, you probably don’t want to end up digging there unless you enjoy this kind of self-punishment.
radare2 looks pretty rad for digging in assembly. Sadly I didn’t have much time to play around with it. Yes I seem to enjoy this kind of self-inflicted pain.
rr is a project from Mozilla that lets you record a program run and then debug the bug out of it by running it over and over and over until you find it.
F* is the absolute heavy-weight here. It lets you write code, prove that it’s absolutely correct and then translate it to C/C++. Except for the part where you have to be a genius to prove the correctness of any larger program.

6 That’s it?

I’ve tried to compile my approach to not shooting yourself in the foot while coding in C++. Note that while it’s not exactly short, it still doesn’t cover everything, for example how not to shoot yourself in your hand, knee or the back of your neck.
The story is not over though, perhaps you, dear readers, can reveal some tricks you have up your sleeve? Discuss!

Monday, October 17, 2016

Digging into browser CSPRNG

Browsers nowadays support the window.crypto.getRandomValues() API for obtaining cryptographically secure random values suitable for generating private keys and session tokens. And while it’s questionable whether in-browser JavaScript crypto is really secure (it still requires a flawless TLS configuration; forget about encrypting stuff without HTTPS enabled), more and more clients and customers ask me to implement crypto in the browser. Oh well.

Here is a quick review of how the getRandomValues() API is implemented in open source browsers, as of 18 October 2016, so that we can be sure that nothing shady or lame (such as running current time through the Mersenne Twister) is going on.

WebKit

Here it’s pretty straightforward. The first function file is the implementation of the API and it goes directly to the OS.

Chromium

Chromium has its own fork of WebKit but essentially it’s the same story. It uses a global random number generator in the base namespace.

Firefox

Uh … here it’s, uhm… complicated?

The journey starts in the cpp file responsible for that JS function:

https://hg.mozilla.org/mozilla-central/file/tip/dom/base/Crypto.cpp

Here it’s invoking the random number generation service, “@mozilla.org/security/random-generator”. Well, uh, hope it can’t be overridden from chrome JavaScript or something.
The service implementation is here and it calls PK11_GenerateRandomOnSlot:

https://dxr.mozilla.org/mozilla-central/source/security/manager/ssl/nsRandomGenerator.cpp

This one calls C_GenerateRandom here (or possibly other PKCS11 implementations, https://dxr.mozilla.org/mozilla-central/search?q=C_GenerateRandom)

https://dxr.mozilla.org/mozilla-central/source/security/nss/lib/pk11wrap/pk11slot.c

There is a deterministic random byte generator here. It calls RNG_SystemRNG once on boot to init its internal state.

The Windows implementation calls RtlGenRandom instead of CryptGenRandom, which is the official CSPRNG API on Windows. Although the docs don’t say it is crypto-safe, it is used by rand_s from the CRT and that is documented to be crypto-secure.

Live Action

Then I attached a debugger and put a breakpoint at the critical points. And ran the crypto JS API in a loop. Here we can see it goes through the path as expected:

>   freebl3.dll!prng_generateNewBytes(RNGContextStr * rng, unsigned char * returned_bytes, unsigned int no_of_returned_bytes, const unsigned char * additional_input, unsigned int additional_input_len) Line 338   C
    freebl3.dll!prng_GenerateGlobalRandomBytes(RNGContextStr * rng, void * dest, unsigned int len) Line 642 C
    freebl3.dll!RNG_GenerateGlobalRandomBytes(void * dest, unsigned int len) Line 659   C
    nss3.dll!PK11_GenerateRandomOnSlot(PK11SlotInfoStr * slot, unsigned char * data, int len) Line 2247 C
    xul.dll!nsRandomGenerator::GenerateRandomBytes(unsigned int aLength, unsigned char * * aBuffer) Line 37 C++
    xul.dll!mozilla::dom::Crypto::GetRandomValues(JSContext * aCx, const mozilla::dom::ArrayBufferView_base<&js::UnwrapArrayBufferView,&js::GetArrayBufferViewLengthAndData,&JS_GetArrayBufferViewType> & aArray, JS::MutableHandle<JSObject *> aRetval, mozilla::ErrorResult & aRv) Line 105   C++
    xul.dll!mozilla::dom::CryptoBinding::getRandomValues(JSContext * cx, JS::Handle<JSObject *> obj, mozilla::dom::Crypto * self, const JSJitMethodCallArgs & args) Line 70 C++

And seeding the DRBG from the OS once on startup:

>   freebl3.dll!rng_init() Line 419 C
    nss3.dll!PR_CallOnce(PRCallOnceType * once, PRStatus(*)() func) Line 779    C
    freebl3.dll!RNG_RNGInit() Line 495  C
    nss3.dll!secmod_ModuleInit(SECMODModuleStr * mod, SECMODModuleStr * * reload, int * alreadyLoaded) Line 232 C
    nss3.dll!secmod_LoadPKCS11Module(SECMODModuleStr * mod, SECMODModuleStr * * oldModule) Line 480 C
    nss3.dll!SECMOD_LoadModule(char * modulespec, SECMODModuleStr * parent, int recurse) Line 1537  C
    nss3.dll!SECMOD_LoadModule(char * modulespec, SECMODModuleStr * parent, int recurse) Line 1572  C
    nss3.dll!nss_InitModules(const char * configdir, const char * certPrefix, const char * keyPrefix, const char * secmodName, const char * updateDir, const char * updCertPrefix, const char * updKeyPrefix, const char * updateID, const char * updateName, char * configName, char * configStrings, int pwRequired, int readOnly, int noCertDB, int noModDB, int forceOpen, int optimizeSpace, int isContextInit) Line 436   C
    nss3.dll!nss_Init(const char * configdir, const char * certPrefix, const char * keyPrefix, const char * secmodName, const char * updateDir, const char * updCertPrefix, const char * updKeyPrefix, const char * updateID, const char * updateName, NSSInitContextStr * * initContextPtr, NSSInitParametersStr * initParams, int readOnly, int noCertDB, int noModDB, int forceOpen, int noRootInit, int optimizeSpace, int noSingleThreadedModules, int allowAlreadyInitializedModules, int dontFinalizeModules) Line 638   C
    nss3.dll!NSS_Initialize(const char * configdir, const char * certPrefix, const char * keyPrefix, const char * secmodName, unsigned int flags) Line 812  C
    xul.dll!mozilla::psm::InitializeNSS(const char * dir, bool readOnly, bool loadPKCS11Modules) Line 976   C++
    xul.dll!nsNSSComponent::InitializeNSS() Line 1742   C++
    xul.dll!nsNSSComponent::Init() Line 1948    C++
    xul.dll!nsNSSComponentConstructor(nsISupports * aOuter, const nsID & aIID, void * * aResult) Line 174   C++
    xul.dll!nsComponentManagerImpl::CreateInstanceByContractID(const char * aContractID, nsISupports * aDelegate, const nsID & aIID, void * * aResult) Line 1203    C++
    xul.dll!nsComponentManagerImpl::GetServiceByContractID(const char * aContractID, const nsID & aIID, void * * aResult) Line 1561 C++
    xul.dll!nsCOMPtr_base::assign_from_gs_contractid(const nsGetServiceByContractID aGS, const nsID & aIID) Line 103    C++
    xul.dll!nsCOMPtr<nsINSSComponent>::nsCOMPtr<nsINSSComponent>(const nsGetServiceByContractID aGS) Line 541   C++
    xul.dll!EnsureNSSInitialized(EnsureNSSOperator op) Line 196 C++

Conclusion

In summary, the browsers behave as expected, providing random numbers seeded by the OS crypto-safe random number generator.

Thursday, June 2, 2016

My list of high-quality online resources

The internet is a tremendous library of information but finding the signal among all the noise is hard work. I think everybody gradually builds their own go-to list of trusted sites and sources and I think it would be a great idea to share so that we can all benefit.

Health & Nutrition

Health information is critical to be correct and unfortunately one of the most likely to be full of bulls**t and charlatans. I'm trying to follow only properly scientifically grounded sources.

US National Library of Medicine MedlinePlus
Authority Nutrition Blog: ok, this one is just a blog and has advertisements. But it's better than others because it links to studies and if you only take away that sugar is bad for you, and low-fat yoghurt is a scam, it'll be helpful.
Sometimes I ask my kungfu sifu :)

Note: never use Google to search for health issues. As you probably know, Google and other big companies are collecting information about all their users and your health issues is information you really don't want anyone to know. Instead, use the Privacy / Incognito mode in your browser and search using Duck Duck Go or Disconnect.Me.

Privacy & Security

Decent Security: basic tips for Windows users
Google Safety Center: while we know that Google is after all your private data, they are also doing a good job of preventing any malicious 3rd parties from getting it from you
Signal App: one of the only easy & secure ways to communicate completely privately. Android, iOS and desktop supported

Technology & Science

Mozilla Developer Network: the ultimate web technology reference. Forget about w3schools, it's outdated or even incorrect
S. Hanselman's Ultimate Tool list: not only for developers
Cosmos: A Spacetime Odyssey: a modern documentary TV series which explains a range of scientific topics. Absolutely beautiful visually with a catching commentary from Neil deGrasse Tyson
National Geographic: with a long tradition of documenting and protecting the environment. Unfortunately its media business was bought by 21st Century Fox (the company behind Fox News) just last year :(
Earth Wind Map: A cool map of the weather. I didn't verify whether their data is correct.
GoodReads: At first I thought just another Social Network for X is not useful but I was wrong. This is a perfect place to steal ideas on what to read from friends with similar interests. And of course, books are still the ultimate fountains of knowledge.

Language & Writing

Corpus Of Contemporary American English: need to check if a certain phrase makes sense in real world English? Just enter it here and avoid making a fool of yourself with made-up expressions
no other resources, that's why the writing in this blog sucks so much

Environment & Charity

Charity Navigator: you want to contribute to a change to the world but not sure if a certain charity is real and is using your money properly? Charity Navigator can help to shine some light into its internal operation.

Friday, March 11, 2016

How to debug neovim python remote plugin

I really really really want to have debugger integration with my Vim setup and while the plugins for old Vim were a little wacky, the new architecture of NeoVim seems promising, so I decided to give lldb.nvim a go.

It didn't work. This is a (~~epic~~ boring) story of how I debugged and fixed the issues.

Step 1: update

Update your neovim to the latest release to avoid fighting issues that have already been solved. At the time of writing, I used:

nvim 0.1.2 from Homebrew
OS X 10.10.5
XCode 7.0
lldb-340.4.70

Step 2: Diagnose

PyThreadState_get error

If you're on OS X, chances are you have more than one Python version installed and that's where the trouble comes from. If you get this error message

>>> import lldb
Fatal Python error: PyThreadState_Get: no current thread

it's most likely because you're trying to import a module that has been linked with a different version of Python. The lldb module comes with the XCode developer tools and was linked with the default system version of Python which lives in (remember this)

/System/Library/Frameworks/Python.framework/Versions/2.7/bin/python

so this is the python version you should use to run the lldb.nvim remote plugin. On my system, BTW, lldb module lives in

/Applications/Xcode.app/Contents/SharedFrameworks/LLDB.framework/Resources/Python

Step 3: Install neovim Python module

The neovim module has probably already been installed with neovim but perhaps not in the correct Python version. You can try to import neovim in the system Python. If it fails, you'll need to install it using easy_install or pip:

sudo /System/Library/Frameworks/Python.framework/Versions/2.7/bin/python -m easy_install neovim

This will install the neovim package into the system Python distribution (needs sudo) using the easy_install tool.

Step 4: Configure neovim to use the system Python

In your neovim config file, add this line:

let g:python_host_prog = '/System/Library/Frameworks/Python.framework/Versions/2.7/bin/python'

This ensures that neovim will start the system Python (which has access to lldb and neovim modules) to host the plugin. After this, you should be all set!

Step 5: Using $PYTHONPATH?

If you do use $PYTHONPATH with your non-system Python, you'll have trouble as well. Before launching the system Python from nvim, you'll need to clean this variable otherwise the packages will interfere with the system Python's packages.

I do that using a small wrapper script ~/syspython2 which gets invoked from nvim as the g:python_host_prog

#!/bin/sh
# running the OS X system python. Required to import the lldb module.
export PYTHONPATH="/Applications/Xcode.app/Contents/SharedFrameworks/LLDB.framework/Resources/Python"

# enable these for debugging
#echo "--" >> ~/syspython2.log
#echo "$@" >> ~/syspython2.log
/System/Library/Frameworks/Python.framework/Versions/2.7/bin/python "$@"

Step 6: Diagnose

Still having trouble?

Don't forget to run :UpdateRemotePlugins
Enable logging in the ~/syspython2 script
Check using pstree | less if neovim is launching the correct Python binary
Double-check you can import neovim and lldb modules from the system Python
Make sure lldb.nvim is installed correctly - the file lldb.nvim/rplugin/python/lldb_nvim.py must exist
NeoVim also tries to load Python 3 plugins, you may need to do the same for Python 3
Try to debug /usr/local/Cellar/neovim/0.1.2/share/nvim/runtime/autoload/remote/host.vim using debugging vim methods
More info about lldb Python module here on StackOverflow

Tuesday, February 23, 2016

Things we don't have in Europe, part II.

With over 3 years spent in Hong Kong, I can share some more minor differences compared to Europe or the Czech Republic that I've noticed. I've also noticed that while some things are different, other things are utterly the same everywhere: the abundance of lazy or stupid people.

Note: people seemed to like the Part I as well.

Anti-pandemic measures and crowd control

How do you know you're in Asia? Well just look around you and if there are more than 10 people in your 1m x 1m personal space then you're probably in an Asian city. A high population density increases the damage from any infectious disease and Hong Kong has learned as much when SARS hit. So now they're trying to curb the spreading of diseases by disinfecting lift buttons, escalator handles and door handles multiple times a day. Or that's what they claim, anyway. Furthermore, posters in public places are asking people to wash hands properly and refrain from spitting.

They also have a lot of experience in crowd control. On Halloween and other important party days, the entire bar street is closed and only a limited stream of people can get in. And even then the place is absolutely packed.

Unusual names

Children in HK are asked to choose an English name for themselves in school. That name is then used more often than the original Chinese name and it really is much easier for foreigners to remember because learning the correct pronunciation of a Chinese name can take a week (in my case). People from the mainland often don't choose an English name so I'm having a harder time with their names.

Anyway, I've found that HKers are much less conservative in choosing names than we are in the West. It is taboo in the EU or US to pick a name outside of a pre-defined set of names. Not here. Using the name of a city or a proper noun is possible. I've even heard stories that some guy picked the name "Chocolate Milk". I'm not sure if these people realize it'll disadvantage them in dealing with Westerners because for us, such names sound silly and ruin the first impression. But it also makes one realize how many arbitrary rules our own culture imposes.

I admit that choosing Never Wong as your name is just pure genius.

Octopus card

I love the Octopus card. Similar ^[1] to the UK Oyster card, it stores value and while primarily used for public transport, you can use it in many other places such as restaurants, convenience stores, vending machines, ferries and even as ID for building entrance. Payment is instant and refilling stored value is possible almost everywhere. If you're coming to HK for more than 3 days, don't even think about using one-off subway tickets, just get the Octopus card. Thanks to this, you can almost get rid of those annoyingly heavy coins.

Dining culture

The #1 pastime in HK must be ... eating. Hong Kong may not have as many arty shows and as much culture as Paris or New York but what you can do every evening is try a new restaurant. With all of the world's cuisines available in thousands of restaurants around the city, there's always something new to explore with your tastebuds. The way to socialize with your buddies is not getting a beer but rather going out for dinner. And after dinner you may continue to a dessert shop where you'll get some Chinese-style desserts based on fruit, tofu and jelly. Who cares that eating sweets when you're already pretty full at 10pm may not be the healthiest thing. The naturally slim Asians are not worried.

When not eating in a fancy restaurant or when you're at home, you may find that the table has a big sheet of plastic bag instead of the table cloth. It looks extremely ugly but it saves the work of cleaning the mess that is inevitably going to hit the table. And after all, the company of your friends matters more than some fancy table cloths. Your Chinese friends will probably offer you a paper tissue when they reach to get one for themselves. Tip: bring 2 packages if you're going to a spicy restaurant.

Chinese tea is an interesting topic. Don't think that everybody around here is an expert on tea and can explain the difference between various Oolong teas at length. I seem to be actually more knowledgeable in this topic than a typical local person. On the other hand, 7-11 convenience stores all have plastic bottles of cold tea, with or without sugar. Not Nestea but rather actual tea. And Chinese style restaurants serve tea as a basic free service. But even then, HK style milk tea and HK style lemon tea are still the most common drinks to consume with your meal.

Some local restaurants offer "Western food". That almost always means one kind of tomato-based soup, offered without fail in the same form by all of them. Apparently we Westerners only know one type of soup. Furthermore, if you order potatoes as the side dish, it's almost always going to be 1 small potato which is clearly not enough carbs to get me through the day. I don't understand it, the rice portions are usually pretty big and potatoes are not even expensive.

Language

The native language of Hong Kong is Cantonese. This is actually the language of the entire southern Chinese region but immediately after you cross the border of Hong Kong to Shenzhen, everything switches to Mandarin. If you keep going North, you'll get back to Cantonese. This anomaly is caused by the numerous immigrants to the industrial megacity of Shenzhen. And this is probably not going to last forever because the Chinese government is actively trying to eradicate Cantonese so that they have a more homogeneous population that is easier to control.

As for English, it is important to HKers to learn English but many still struggle and well written English is hard to come by. On the other hand, if you compare with some European countries where people don't even bother, you have to give HKers some credit for trying.

Often you can spot an English sentence written by a Chinese person not only from the errors but also from the style. Overuse of strong adjectives is very common so typically you can "win fabulous prizes" which are actually just a branded pen and chocolate or download "breathtaking games" such as Pacman or Pong. If you buy a cheap electronic product, you can be already pretty sure it was made in China but for the sake of argument let's say you'd use the language on the box to guess the product's origin. Phrases about "enjoying your life", "enjoying every tap on the device" or "experiencing fabulous digital life" will give you a hint.

Freezing air-con 24/7

This one simply can't go unmentioned. The mystery of air-cons everywhere set to ~~kill~~ freeze remains unsolved. Locals, when interrogated, dodge the topic or remain silent. After 3 years here, though, it seems that the culture dictates that you need to have fresh air flow at all times, otherwise you die. Considering the high humidity in this region I admit this is certainly true to an extent. But locals take it to the extreme and consider even 10 minutes without air-con a threat. Using the fan-only mode is not acceptable either, even in winter: if air is not cooled, it simply cannot be fresh. I wonder what happens when a HKer and a Korean have to sleep in the same room: the HKer will die if the fan or air-con is off, the Korean will die when the fan is left on!

What they don't have here

Going to lunch with colleagues and want to split the bill? Bad luck, waiters will usually not do that for you. Have fun giving back change that you don't have. I foresee cryptocurrency payments to be the only way out of this situation ;) wink wink

Their supermarkets are not air-tight like in Czech. That must mean that people usually wouldn't steal from a supermarket here. I cannot imagine such degree of trust in Czech and it makes me sad.

Honestly, recycling and environmental conservation both appear to be rather alien ideas around here. Restaurants overflow with piles of take-away boxes, you get a plastic bag for everything, and vegetables and fruit in supermarkets are already pre-wrapped in plastic, sometimes in 2 layers! Also, it's really funny to never, ever, see squished plastic bottles at recycling collection points. Squishing them could save a lot of space and almost everyone in the Czech Republic does it. Here, the idea never appeared. Makes me wonder what other useful ideas are completely missing in some parts of the world.

Everybody in cold countries knows how to walk on snow or ice. You just need to move your weight exactly over your foot before relying on it. Children also know how to slide on ice and can go all the way to school just sliding on the icy pavement. In HK on the other hand, even a slightly wet tiled floor is a serious threat. Warning triangles are deployed, floor dryers are put into operation. I know it's to protect building management from being sued but it's just ridiculous. There is ice in HK only once every 35 years and when it comes, it's a little embarrassing:

[1] Fixed incorrect claim, thx Alessio

Thursday, February 18, 2016

The fall of Couchsurfing and the need for DApps

Abstract: How Ethereum DApps can be applied even outside finance and how communities can benefit from technologies of the future.

Couchsurfing (or at least the idea of it) used to be a community of people who would welcome each other in various places around the world, show each other the local culture and recommend the best local places. And since many people have a spare couch at home, why not let the traveller crash there for a night or two?

Of course, not everybody knows everyone, so people would leave references for each other after having spent some time together. The reference would include information such as how long you've known the person, whether your experience of them was positive or negative and of course a paragraph or two. The site emphasized that references are the fundamental tool to keep people safe.
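To make the structure of a reference concrete, here is a minimal sketch in Python. The field names (`author`, `known_for_days`, `positive`, `text`) and the `summarize` helper are made up for illustration; this is not how the CS site actually stores its data.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    # All field names are hypothetical, chosen only for this sketch.
    author: str          # who wrote the reference
    known_for_days: int  # how long the author has known the person
    positive: bool       # overall experience: positive or negative
    text: str            # the free-form paragraph or two


def summarize(references):
    """Count positive and negative references without reading any text."""
    counts = Counter("positive" if r.positive else "negative" for r in references)
    return {"positive": counts["positive"], "negative": counts["negative"]}


refs = [
    Reference("alice", 3, True, "Great host, showed me around the old town."),
    Reference("bob", 1, False, "Cancelled at the last minute."),
    Reference("carol", 400, True, "Old friend, very reliable."),
]
print(summarize(refs))
```

The point of the structured fields is that a profile can be judged quickly: `summarize` only looks at the `positive` flag and never has to interpret the free text.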

People accumulated lots of positive references over the years by letting backpacking travelers crash on their couch and showing them around. People knew they could trust a bunch of good references. This good track record would make it easier to find a couch when they went traveling (which for a typical CSer is often). So we could say that having lots of good references on CS has some value and is not that easy to build.

Of course, nothing lasts forever and CS, no longer a non-profit, is moving away from the original idea. The home page https://www.couchsurfing.com says something about staying somewhere for free. No mention of cultural exchange or making new friends. The latest step they've taken is removing metadata from references, leaving only the verbal description. So you can no longer see at a glance if a reference is positive or negative. Want to know if that stranger is trustworthy enough to let them stay? Sure, just read through all of their references!

CS is doing this because they are now owned by the same group which owns AirBnB and other paid accommodation services. "Free accommodation" using couchsurfing is not a good alternative to their paid services so they need to get rid of it. And delete years' worth of good references.

We could say the problem is the amorality of the new owners of CS, who are destroying the community and something that people have been building. Maybe. The CS website owns all the data that users have entered there and it has control over what shows on the website. The owners can make any move that's bad for the community and the users are powerless, even though they create the site's content and actually the entire value of the website.

It doesn't have to be like that. ~~Scientists are working~~ Computer nerds are working on a new Web, one where users are in control because they own the site and its data collectively. No longer having to trust one person or company that could become evil or simply sell out to some greedy profit-seeker. In this way, changes that wouldn't benefit the community could not be made and data could not be deleted.
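One ingredient behind "data could not be deleted" is making tampering detectable by anyone who holds a copy. Here is a toy sketch of that idea in Python, using a hash-chained, append-only log. It is only an illustration under my own assumptions: the function names and the record layout are invented, and a real system would add signatures, replication across many users and a way to agree on the latest entry.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash, payload):
    """Hash an entry together with the hash of the entry before it."""
    data = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


def append(log, payload):
    """Add a new entry that commits to everything written before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"prev": prev_hash, "payload": payload,
                "hash": entry_hash(prev_hash, payload)})


def verify(log):
    """Return True only if no entry was edited or removed from the middle."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append(log, {"about": "dave", "positive": True, "text": "Great guest."})
append(log, {"about": "dave", "positive": False, "text": "Left a mess."})
append(log, {"about": "dave", "positive": True, "text": "Fun evening."})

# Quietly dropping the negative reference breaks the chain.
censored = [log[0], log[2]]
print(verify(log), verify(censored))
```

Note the limit of this sketch: cutting entries off the end of the log is only noticeable if somebody else still remembers the latest hash, which is exactly why the data has to be held by many people and not by one company.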

Those websites are called DApps (from decentralized apps, because they are owned by multiple people) and they are slowly becoming possible thanks to technologies such as Ethereum, Bitcoin, IPFS and others. They can be used to build financial services, truly democratic communities or what I've described above. Sounds interesting? Get involved! Even if you're not technical at all, an ordinary user can help a lot in this early stage. Install the apps. Start playing around, get involved in discussion, many ideas still need figuring out. Tell other people about the idea.

And what is the fate of Couchsurfing? Multiple people are thinking about starting a new site. Starting from scratch, without the data that has been built up in CS, because there's no way to transfer the data to a new place (another problem of the old website system). Building a new CS as a DApp would not be easy at this moment because there's very little support available for DApp developers. It is a largely unexplored area and, unlike traditional websites, needs original work to build anything. But that will change over time.