Final

Well, I guess I should've written this post earlier :D But better late than never. A whole month has passed since the end of GSoC, and even more since the last post. Long story short, I passed GSoC and got something working. I didn't have time to write the final post because I've been busy moving to Finland and getting used to university life here (I'm studying abroad this semester). Now I finally have some spare time, and I'd like to talk about the results, the problems I ran into during development, and how I solved them. Most importantly, I'd like to talk about future plans and integration.

Results

I think I managed to bring the project to the stage where it can be played with. The main repository has a configured example. To change the debugged project, you just change the project name in the example's main file, and the build system does the rest (it's that easy). Plenty of examples to debug are available in the gdb-examples repository (including multithreaded projects, projects with multiple processes, and projects that dynamically load object files). It seems to actually work! I believe it's already possible to integrate it out of curiosity, but since I didn't have time to finish the UI (styles, etc.), it's probably not that appealing yet. I also wrote fairly comprehensive documentation for gdb-js and a smaller one for react-gdb (it just doesn't need that much). Anyway, here's my evaluation report (it's concise, to say the least, heh).

Problems and solutions

Now, finally, the interesting part. A lot of time has passed, so I've probably forgotten most of the problems. But that's for the better, since only the most interesting ones remain (I hope).

DockerHub

There's a problem with tests that depend on a Docker image. Whenever you push a new build to DockerHub, the CI tests of the dependent project are likely to break (even without pushing new commits!). That's because you always pull the latest tag, and it's not possible to pull a specific build. So the only solution is to tag every release of your Docker image and pull that specific version. It worked for me, at least. Still, the DockerHub repository of gdb-examples needs to be transferred to the @taskcluster organization, and automated builds need to be set up there.

Context

I learned and recalled a lot about the C compilation process during this project. And at last I fixed and improved the context command, which now returns all global and local symbols in the context (static variables, externs, arguments) together with their types.

Classes

JavaScript is a dynamically typed language, so adding new abstractions by creating new classes doesn't really help and usually complicates the code rather than simplifying it. But in this case, adding new object types for threads, breakpoints, etc. helped to simplify the API. It's important to remember, though, that they are just wrappers around static information.

Scopes

I extended the idea of executing methods within threads to thread groups. Now most methods accept a scope parameter, which can be a thread object, a thread-group object, or nothing. This really helped to support debugging multiple processes.
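To make the idea of a scope concrete, here is a rough sketch, not the actual gdb-js code. It assumes a scope is translated into the `--thread` and `--thread-group` options of GDB/MI. The `Thread` and `ThreadGroup` shapes and the `scopeToOptions` and `buildCommand` helpers are hypothetical names made up for illustration.

```javascript
// Hypothetical minimal shapes; the real gdb-js objects carry more fields.
class Thread {
  constructor(id) {
    this.id = id; // GDB thread id, e.g. 2
  }
}

class ThreadGroup {
  constructor(id) {
    this.id = id; // GDB/MI thread-group id, e.g. "i1"
  }
}

// Translate an optional scope into GDB/MI command options.
function scopeToOptions(scope) {
  if (scope === undefined || scope === null) return "";
  if (scope instanceof Thread) return `--thread ${scope.id}`;
  if (scope instanceof ThreadGroup) return `--thread-group ${scope.id}`;
  throw new TypeError("scope must be a Thread, a ThreadGroup or nothing");
}

// Assemble a full MI command line from its name, scope and arguments.
function buildCommand(name, scope, args = "") {
  return [`-${name}`, scopeToOptions(scope), args].filter(Boolean).join(" ");
}
```

With this, `buildCommand("stack-list-frames", new Thread(2))` yields a command restricted to one thread, while omitting the scope leaves the choice to GDB's current context.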

Source files

There were multiple problems with source files. First, the list of source files changes over time (e.g. when a new process is attached or a new executable is loaded with an execl call). I decided that the most reasonable approach is to update the global list of source files any time a new object file is loaded into GDB. But to avoid unnecessary updates, I added a filter for objfiles so that only files matching a regexp trigger the update (there are a lot of shared libraries, you know! :)). Second, the command that returns the list of source files does so only for the current thread group (i.e. target). Thus, I needed to switch thread groups manually in order to get their source file lists (scopes really helped there). Third, some projects have an enormous number of source files, so it's reasonable to filter them with a regexp. This can be done with a custom Python command, but due to a GDB/MI bug you can't really get the results of MI commands in Python. Only the Python and CLI APIs are available there, and the Python API doesn't know anything about source files. So I made use of the CLI API, which involved a couple of hacks to make it more reliable (of course, I'm still using Python for this).
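The objfile filter can be pictured roughly like this. It is only a sketch of the idea, not the gdb-js implementation: `SourceList`, `onNewObjfile` and the `fetchSources` callback are hypothetical names, and in the real project the refresh would involve an asynchronous round trip to GDB.

```javascript
// Hypothetical helper: build a predicate that tells whether a newly
// loaded object file is interesting enough to refresh the source list.
function createObjfileFilter(pattern) {
  const regexp = new RegExp(pattern);
  return (objfileName) => regexp.test(objfileName);
}

// Hypothetical holder of the global source list.
class SourceList {
  // fetchSources stands in for whatever actually asks GDB for the files.
  constructor(pattern, fetchSources) {
    this.matches = createObjfileFilter(pattern);
    this.fetchSources = fetchSources;
    this.files = [];
  }

  // Called for every new-objfile event; returns true if a refresh happened.
  onNewObjfile(objfileName) {
    if (!this.matches(objfileName)) return false; // e.g. a shared library
    this.files = this.fetchSources();
    return true;
  }
}
```

A pattern such as `"^/project/"` would then ignore system libraries and refresh the list only for binaries built from the debugged project.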

Attaching processes

There were multiple issues here too. First, to attach to a process you need root privileges. Luckily, Docker has them by default. Second, you also need to run the Docker container with the --security-opt seccomp:unconfined flag. And third, you need to manually create a new inferior and load the process there; otherwise it'll override the current target. gdb-js now does all of this automatically.
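The third step can be sketched as a sequence of ordinary GDB CLI commands. This is only an illustration under assumptions: `attachCommands` is a hypothetical helper, the caller is assumed to know which number GDB will give the new inferior, and gdb-js itself may well do the same thing through GDB/MI instead.

```javascript
// Build the GDB CLI commands that attach to `pid` inside a fresh inferior,
// so that the currently debugged target is left untouched.
// `inferiorNumber` is an assumption: the number GDB assigns to the
// inferior created by `add-inferior`.
function attachCommands(pid, inferiorNumber) {
  if (!Number.isInteger(pid) || pid <= 0) {
    throw new RangeError("pid must be a positive integer");
  }
  if (!Number.isInteger(inferiorNumber) || inferiorNumber <= 0) {
    throw new RangeError("inferiorNumber must be a positive integer");
  }
  return [
    "add-inferior",               // create an empty inferior
    `inferior ${inferiorNumber}`, // switch to it
    `attach ${pid}`,              // attach the running process there
  ];
}
```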

Commands results

I almost completely changed the logic for handling the results of CLI and custom command execution. Before, I executed custom commands as CLI commands in the Python API and returned the string as the result. The problem is that while a CLI command runs in the background in the Python API, a lot of asynchronous things may happen, and their output also ends up in the resulting string (not sure if it's a bug or a convenience, heh). So it's not possible to predict what the custom command will return. But there's a way to make custom commands more reliable: move the logic of framing results into the base class of the custom commands (that's exactly what I did). I also added the ability to create custom asynchronous event handlers and subscribe to them in the same manner. It helped to make use of the new-objfile event.
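Framing here means wrapping the payload in markers so that it can be found again inside noisy output. Below is a minimal sketch of the receiving side. The marker format and the `frame` and `extractResult` names are made up for illustration (the real markers in gdb-js may look different), and the payload is assumed to be JSON.

```javascript
// Hypothetical markers around the payload of a custom command.
const openMarker = (name) => `<frame:${name}>`;
const closeMarker = (name) => `</frame:${name}>`;

// What the command's base class would print (shown in JS for symmetry).
function frame(name, payload) {
  return openMarker(name) + JSON.stringify(payload) + closeMarker(name);
}

// Pull the framed payload out of a string that may also contain
// unrelated asynchronous output. Returns null if no frame is found.
// Limitation of this sketch: the payload must not contain the close marker.
function extractResult(name, output) {
  const open = openMarker(name);
  const close = closeMarker(name);
  const start = output.indexOf(open);
  if (start === -1) return null;
  const end = output.indexOf(close, start + open.length);
  if (end === -1) return null;
  return JSON.parse(output.slice(start + open.length, end));
}
```

Anything printed before or after the frame is simply ignored, so stray notifications no longer change what the command appears to return.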

Synchronization

async and await are great, but it's very easy to shoot yourself in the foot and not notice it if you're not paying enough attention to how the event loop works. In particular, when two async methods depend on some shared data or state, calling them simultaneously may produce different results (i.e. the methods are unstable). This is very relevant to me, since some methods of gdb-js depend on the internal GDB state. It would've been okay in a simple case, but I use it inside a Redux application, where actions don't wait for other actions to complete. This leads to using mutexes and/or semaphores. Sounds weird, huh? JavaScript is not multithreaded, and that's why everyone on the Internet says that mutexes are not needed there. But I've faced this problem before, and I know that's not true. So I googled harder and found this article, with which I agree completely. Still, I went the easier way: I split the methods into public and private ones and wrapped the public methods with a synchronize routine. Now no public method is executed before the previous ones have completed. By the way, this is a good use case for ES7 decorators :)
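A synchronize routine of this kind can be built on a promise chain. The sketch below is one possible way to do it and not necessarily how gdb-js does it: `createSynchronizer` and the demo names are hypothetical. Every wrapped call is queued behind the previous one, and a failed call does not block the queue.

```javascript
// Create a `synchronize` wrapper whose wrapped functions share one queue.
function createSynchronizer() {
  let tail = Promise.resolve(); // settles when the last queued call is done

  return function synchronize(fn) {
    return function (...args) {
      // Start only after everything queued earlier has finished.
      const result = tail.then(() => fn.apply(this, args));
      // Swallow errors on the queue itself so later calls still run;
      // the caller still sees the rejection through `result`.
      tail = result.catch(() => {});
      return result;
    };
  };
}

// Small demo: two overlapping calls run strictly one after another.
const synchronize = createSynchronizer();
const log = [];
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

const step = synchronize(async (name, ms) => {
  log.push(`start ${name}`);
  await sleep(ms);
  log.push(`end ${name}`);
  return name;
});

const done = Promise.all([step("a", 20), step("b", 0)]);
```

Without the wrapper, the second call would start while the first one is still sleeping. With it, the second call begins only after the first has ended, even though it was issued right away.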

Dealing with GDB/MI bugs

What else would you expect :D Of course, there were a lot of bugs. I fixed the GDB/MI parser because I found further mistakes in the grammar. Also, there was a bug with background execution of -exec-run, so I switched to the run& CLI command to handle this. Anyway, what I wanted to say is that Bugzilla turned out to be a very good helper in resolving GDB-specific bugs.

react-gdb API

And yeah, I changed the react-gdb API a little. The README is pretty comprehensive, so I don't think it's necessary to elaborate on it here. I just wanted to say that detachOnFork is meant for debugging purposes; it's recommended to use inferiorProvider instead, since it's a cross-platform feature.

Ahead-of-Time breakpoints

At the moment, the main source of the project's source files is GDB. This works fine, but the list contains only files that were used in compilation. If we want a more IDE-like environment, we need to fetch the list from the repository. More importantly, the user will probably want to set breakpoints in files that are not yet loaded into GDB (e.g. when debugging multiple targets), so it makes sense to use an external source list. Whoa, seems like a complete refactoring! But actually it's not. It's just another layer of abstraction. I don't want to dive into details, but I'd like to say that mapper functions will be needed (ones that map a repository file to a file in the file system and vice versa).
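Such a pair of mapper functions could look roughly like this. It is purely a sketch of the planned idea, with hypothetical names (`createPathMapper`, `toFileSystem`, `toRepository`), and it assumes the repository is checked out under a single root directory on the debugged machine.

```javascript
// Hypothetical mapper between repository-relative paths and absolute
// paths in the file system GDB sees.
function createPathMapper(checkoutRoot) {
  const root = checkoutRoot.replace(/\/+$/, ""); // drop trailing slashes

  return {
    // "src/main.c" -> "<root>/src/main.c"
    toFileSystem(repoPath) {
      return `${root}/${repoPath.replace(/^\/+/, "")}`;
    },

    // "<root>/src/main.c" -> "src/main.c"; null for files outside the checkout.
    toRepository(fsPath) {
      const prefix = `${root}/`;
      return fsPath.startsWith(prefix) ? fsPath.slice(prefix.length) : null;
    },
  };
}
```

A breakpoint set on a repository file before GDB knows about it could then be stored by repository path and translated with `toFileSystem` once the matching target is loaded.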

Lessons learned

JavaScript is not scalable

I've always liked JS for its flexibility, which helps in prototyping. However, it's not scalable, and due to its dynamic nature it's hard to refactor. This project was a little bit atypical, and I needed to refactor the code a couple of times. That is, to say the least, a pain in the ass. Lots of runtime errors, "undefined is not a function" and so on. I even considered using TypeScript, but typing is not the only problem of JS. In medium to large projects it's beneficial to do things in a functional style, since such code is easier to maintain. That's possible in JS with tons of libs like Immutable.js, and React is also functional to the bone. But it starts to annoy really quickly because of too much boilerplate and tons of external tools. The language itself doesn't help at all! I remember using F# for a UI application, and I really liked the experience. It's not as verbose as Java and C#, but it still has all the pros of a statically typed language. I liked Haskell even more for its brutal purity. So I decided to find a good alternative to JS that compiles to JS. I spent a lot of time on that, because there are a lot of choices: TypeScript, ClojureScript, ScalaJS, GHCJS, PureScript, Elm, etc. To put it briefly: ClojureScript is dynamic, TypeScript is not functional, ScalaJS and GHCJS have huge runtimes, and Elm is limited. PureScript, on the other hand, is the most pragmatic solution I've ever seen, seriously. Basically, it's Haskell with JS semantics (without laziness by default). It's very easy to embed into any JS application (no harder than starting to use CoffeeScript!). It has a very simple FFI, it can be used on the server, and, more importantly, there's no need to give up the familiar tools and the whole Node.js ecosystem. So, for now I'll stick with it.

TDD is not an answer

I tried to practice TDD in the project. But I soon noticed that blindly following TDD best practices doesn't do any good. Then I happened upon the "Is TDD dead?" series, watched it and learned some lessons. I understood that mocks should be avoided if possible. If you are writing tests returning mocks returning mocks returning mocks, then your tests depend on the exact implementation, and that makes the code impossible to refactor. Only the public interface should be tested, and when the internal logic starts to grow too fast, it's time to think about moving it into a separate, useful and reusable module and testing that module's public interface instead. In this sense it's a good thing that I have separate modules for gdb-js and react-gdb.

Plans

Yeah, I know that talking about plans after one month of inactivity is a little bit bold :) But I hope I'll have some spare time to contribute, even though I have to pass exams at two universities in one semester, heh. Anyway, the only remaining thing to implement is a pretty UI, which is not a lot. Usually I spend some time on creating a good UI, but this time I didn't have enough time :( So, first I'd like to quickly go through Google's course on responsive design and then try to apply it to the project. I'll probably use react-bootstrap together with CSSModules (which is already in use). Custom themes also need to be supported. I don't have enough scopes to integrate it with taskcluster and real test cases from treeherder, but I'm sure that once the styles are ready, someone will want to integrate it! :)

P.S.

This is the last post in this blog. From now on, everything related to the project will be posted in the GitHub repository. It was a nice and really useful time for me, and I hope this project will be useful for someone else as well. Goodbye! :)