Little Annoyances, Big Decisions
A story
I think software engineers, myself included, are influenced heavily by annoyances, and those annoyances drive major decisions in our industry far more than we'd like to admit.
I worked with someone, a much more senior engineer than me at the time, who strongly dislikes Java and strongly prefers C++. I'll call this person Bjarne to remind you that they love C++ (Bjarne Stroustrup created C++).
Years ago Bjarne got into a dispute with another engineer over the design of a new system, which I'll call the Accounting Gateway.[1] I'll call this other engineer James (after James Gosling, who created Java), because he wanted to write the new system in Java, while Bjarne wanted C++. This discussion was possible because the Accounting Gateway communicated via a REST API with other systems, most written in Java.
Well, Bjarne lost the battle and the Accounting Gateway was written in Java. It had some problems—for example, it had poor performance, and James went a bit overboard on the fancy Java features—but after a few years the system had matured, and these issues were largely fixed. James left the company, but Bjarne was still around. A new team took over the Accounting Gateway, and actively maintained and improved it.
Later, Bjarne started a new project. Its feature set overlapped largely with the Accounting Gateway, so I'll call it the Auditing Gateway. In fact, the Accounting Gateway was designed in such a way that its internal components could be largely reused for the Auditing Gateway, albeit with some minor refactoring.
But Bjarne decided the Auditing Gateway should be written in C++. This is the point I want to scrutinize. Reuse seems like the obvious choice to me, and short of exposing the internal components of the Accounting Gateway as microservices themselves (which would take a lot of work), reuse means Java. Reuse would also leverage years of engineering, lessons learned, and production-hardened code, and it would take less effort to build. Quick and safe reuse is the primary benefit of good software design. Why waste that?
I suspect that Bjarne's opinion of Java was a significant factor. But when I asked Bjarne, I never got more concrete answers than, "I don't like Java," and falsehoods like, "C++ is much more performant than Java." What happened in the end? Amazon bought the company and decided to rewrite everything anyway to fit in with Amazon's systems.
How to hate a language less
I wouldn't say I love Java, but my usage of it has eclipsed that of all other languages over the last four years. I had used a lot of Java in the mid-2000s as a student, but since then much had changed, especially in how Google does Java.
I was already familiar with the concepts behind many of these practices, such as heavily favoring immutable collection types, and the analogues of Java streams in other languages. Dependency injection [2] was new to me, but that's too big of a topic for this newsletter.
Instead, I want to show some incredibly mundane things about Java (and my particular workflow) that have drastically improved my feelings about working with Java. That is, they're solutions to stupid annoyances, so dumb that most people don't bother to think about them, but without which, I would be incentivized to make poorer decisions that build up to hating everything, and maybe giving up on Java entirely like Bjarne.
I know because I actually started "hating everything" (almost) about the Java I worked on, but it was a vague cloud of contempt. After repeatedly hitting roadblocks, reflecting on the causes, and coming up with better ways, I reduced the contempt to the point that now (looking at other people's C++ and Python code), I prefer Java. Though I like to think that, through the same process in any other language, I would develop similar ways to cope.
In the end, this showcases how to invest in your tools to alleviate pain points. It's a force multiplier, and hopefully you won't find yourself throwing your hard work away in the hopes that a different language will save you. Chances are good it won't.
AutoValue
People who hate Java often complain about the excessive amount of boilerplate code. I largely agree. Everything is in a class, and if you write a class to hold some data, then you think about equals and hashCode, and if they don't agree then collections of your objects (like Set or Map) start to behave strangely and stuff breaks.
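To make "behave strangely" concrete, here is a minimal sketch in plain Java (no libraries). The class names NaiveAnimal, CarefulAnimal, and EqualsHashCodeDemo are made up for illustration. It shows a HashSet keeping two "identical" objects when equals and hashCode are left at their identity-based defaults, and collapsing them to one when both methods are written by hand.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

/** Holds data but inherits identity-based equals/hashCode from Object. */
final class NaiveAnimal {
  final String name;
  final int numberOfLegs;

  NaiveAnimal(String name, int numberOfLegs) {
    this.name = name;
    this.numberOfLegs = numberOfLegs;
  }
}

/** Same data, with equals and hashCode written by hand so that they agree. */
final class CarefulAnimal {
  final String name;
  final int numberOfLegs;

  CarefulAnimal(String name, int numberOfLegs) {
    this.name = name;
    this.numberOfLegs = numberOfLegs;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof CarefulAnimal)) {
      return false;
    }
    CarefulAnimal other = (CarefulAnimal) o;
    return name.equals(other.name) && numberOfLegs == other.numberOfLegs;
  }

  @Override
  public int hashCode() {
    return Objects.hash(name, numberOfLegs);
  }
}

final class EqualsHashCodeDemo {
  /** Two animals with the same data are treated as different set members. */
  static int naiveSetSize() {
    Set<NaiveAnimal> set = new HashSet<>();
    set.add(new NaiveAnimal("Babe", 4));
    set.add(new NaiveAnimal("Babe", 4));
    return set.size();
  }

  /** With value-based equals/hashCode the duplicate is dropped. */
  static int carefulSetSize() {
    Set<CarefulAnimal> set = new HashSet<>();
    set.add(new CarefulAnimal("Babe", 4));
    set.add(new CarefulAnimal("Babe", 4));
    return set.size();
  }

  public static void main(String[] args) {
    System.out.println("naive set size: " + naiveSetSize()); // 2
    System.out.println("careful set size: " + carefulSetSize()); // 1
  }
}
```

The hand-written version is about twenty extra lines per class, and every new field has to be added in three places. That is the boilerplate people complain about.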
Python gets around this boilerplate with dataclass (or namedtuple, its backwater cousin), though I don't know many other people besides me who use dataclasses heavily. I didn't use dataclasses until I discovered the Java library AutoValue and realized how much better it made Java.
AutoValue is basically a compile-time code generator (a Java annotation processor) that auto-generates dataclasses (with equals and hashCode) from a succinct(er) syntax, using some trickery with class inheritance to make the succincter part you write correct Java code.
An AutoValue looks like this:
@AutoValue
abstract class Animal {
  abstract String name();
  abstract int numberOfLegs();
  static Animal create(String name, int numberOfLegs) {
    return new AutoValue_Animal(name, numberOfLegs);
  }
}
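The "trickery with class inheritance" can be sketched by hand. What follows is not AutoValue's actual generated source, just a plain-Java approximation of the pattern: you write an abstract class that only declares accessors, and a concrete subclass (here the hypothetical HandWritten_Animal, standing in for the generated AutoValue_Animal) supplies the fields, equals, hashCode, and toString.

```java
import java.util.Objects;

/** The part you write: accessors only, plus a static factory. */
abstract class Animal {
  abstract String name();

  abstract int numberOfLegs();

  static Animal create(String name, int numberOfLegs) {
    return new HandWritten_Animal(name, numberOfLegs);
  }
}

/** Hand-written stand-in for the subclass a code generator would emit. */
final class HandWritten_Animal extends Animal {
  private final String name;
  private final int numberOfLegs;

  HandWritten_Animal(String name, int numberOfLegs) {
    // Reject null up front so equals/hashCode never have to handle it.
    this.name = Objects.requireNonNull(name, "name");
    this.numberOfLegs = numberOfLegs;
  }

  @Override
  String name() {
    return name;
  }

  @Override
  int numberOfLegs() {
    return numberOfLegs;
  }

  @Override
  public boolean equals(Object o) {
    if (o == this) {
      return true;
    }
    if (o instanceof Animal) {
      Animal that = (Animal) o;
      return name.equals(that.name()) && numberOfLegs == that.numberOfLegs();
    }
    return false;
  }

  @Override
  public int hashCode() {
    return Objects.hash(name, numberOfLegs);
  }

  @Override
  public String toString() {
    return "Animal{name=" + name + ", numberOfLegs=" + numberOfLegs + "}";
  }
}
```

Client code only ever names the abstract type, so Animal.create("Babe", 4).equals(Animal.create("Babe", 4)) holds without the author of Animal writing any of the comparison code.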
AutoValue also includes a bunch of variants that allow you to have AutoValues with builder syntax, to use factories, and to sensibly handle types like Optional.
A typical AutoValue I write these days looks a bit more complex, like
import com.google.auto.value.AutoValue;
/** javadoc */
@AutoValue
public abstract class Animal {
  public abstract String synthesizedField();
  /** Static factory method for {@link Factory.Builder}. */
  public static Factory.Builder builder() {
    return new AutoValue_Animal_Factory.Builder();
  }
  /** Factory for {@link Animal} */
  @AutoValue
  public abstract static class Factory {
    public abstract String name();
    public abstract int numberOfLegs();
    private Animal build() {
      // this would normally be lots of complex logic
      String synthesizedField = String.format("%s, %s", name(), numberOfLegs());
      return new AutoValue_Animal(synthesizedField);
    }
    /** Builder for {@link Factory}. */
    @AutoValue.Builder
    public abstract static class Builder {
      public abstract Factory autoBuild();
      public abstract Builder setName(String name);
      public abstract Builder setNumberOfLegs(int numberOfLegs);
      public final Animal build() {
        return autoBuild().build();
      }
    }
  }
}
Still looks verbose? It is, but to avoid the tedium of typing this out every time (I make dozens of these) I created a snippet, so I just type the following, hit <C-j>, and then fill in the "synthesized" parts manually.
autovaluefactory Animal String name int numberOfLegs
Here's the definition of the snippet, with a similar one for a "plain" AutoValue.
The benefit of the Factory is to hide the setup logic internally, and the benefit of the builder is to allow one to add and remove builder pieces (or set defaults for some fields) without method overloads or forcing all clients to update a complicated argument list, as would be required with create(String, int, ...).
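To illustrate why a builder tolerates change better than a long create(...) argument list, here is a hand-written sketch in plain Java. It is a stand-in for what AutoValue would generate, not AutoValue itself, and the class AnimalSpec and its domesticated field are hypothetical. The field added "later" gets a default, so callers that never mention it keep compiling unchanged.

```java
/** Hand-written value class with a builder (hypothetical example). */
final class AnimalSpec {
  private final String name;
  private final int numberOfLegs;
  private final boolean domesticated;

  private AnimalSpec(Builder builder) {
    this.name = builder.name;
    this.numberOfLegs = builder.numberOfLegs;
    this.domesticated = builder.domesticated;
  }

  String name() {
    return name;
  }

  int numberOfLegs() {
    return numberOfLegs;
  }

  boolean domesticated() {
    return domesticated;
  }

  static Builder builder() {
    return new Builder();
  }

  static final class Builder {
    private String name; // required, no default
    private int numberOfLegs = 4; // default, so most callers can skip it
    private boolean domesticated = false; // added later; old callers are unaffected

    Builder setName(String name) {
      this.name = name;
      return this;
    }

    Builder setNumberOfLegs(int numberOfLegs) {
      this.numberOfLegs = numberOfLegs;
      return this;
    }

    Builder setDomesticated(boolean domesticated) {
      this.domesticated = domesticated;
      return this;
    }

    AnimalSpec build() {
      if (name == null) {
        throw new IllegalStateException("Missing required property: name");
      }
      return new AnimalSpec(this);
    }
  }
}
```

A caller written before the domesticated field existed, such as AnimalSpec.builder().setName("Babe").build(), still compiles and gets the default. With a positional create(String, int, boolean) every call site would have needed a new argument.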
This builder style also reduces test-code boilerplate. I often have AutoValues with factories that have ten or more fields, and so in a unit test I can set up the default data in a helper, like so
class AnimalTest {
  private Animal.Factory.Builder defaultBuilder() {
    return Animal.builder()
        .setName("Babe")
        .setNumberOfLegs(4);
  }
  @Test
  public void myTest() {
    Animal actual = defaultBuilder().setNumberOfLegs(3).build();
    // The Factory above formats the field as "<name>, <numberOfLegs>".
    assertThat(actual.synthesizedField()).isEqualTo("Babe, 3");
  }
}
Each test only sets the subset of fields that matter for that test's assertion. Before I started using this style of AutoValue, I resisted changing code, in part because the amount of work required to update the tests far exceeded the amount of work to update the business logic. It also had the side effect of me wanting to write fewer tests, because I knew it incurred such a large maintenance burden. AutoValues made writing (and reading!) tests much easier. It's not an obvious benefit at first, but a very meaningful one.
(Functional) Interfaces
Generally, it's a good idea to design software in terms of interfaces and guarantees provided by those interfaces. This allows implementation details to be changed while reducing the likelihood of breaking client code.
Java embraces this ideology, and I particularly like it for faking in tests. Say the factory for the Animal class above requires an instance of another class AnimalKingdomHierarchy, which also has complicated setup. Then your tests must instantiate the AnimalKingdomHierarchy every time you create an Animal.
Many people get around this by using mocks,[3] which introduces a whole host of other issues like, for example, the fact that poorly-setup mocks throw null-pointer errors at runtime. And mock setup is often as cumbersome as setting up real objects.
However, if AnimalKingdomHierarchy is an interface, then you can fake the interface, meaning create an implementation that has real code backing it, but is otherwise a simplified version of the real instance. For AnimalKingdomHierarchy it might look like this
public interface AnimalKingdomHierarchy {
  public Order getOrder(int numberOfLegs);
}
And in the test you could fake it like
class AnimalTest {
  private Animal.Factory.Builder defaultBuilder() { ... }
  @Test
  public void myTest() {
    // An in-line (anonymous class) fake of the interface.
    AnimalKingdomHierarchy hierarchy = new AnimalKingdomHierarchy() {
      @Override
      public Order getOrder(int numberOfLegs) {
        return numberOfLegs == 8 ? Order.create("Araneae") : Order.create("unknown");
      }
    };
    Animal actual =
        defaultBuilder()
            .setName("Babe")
            .setNumberOfLegs(8)
            .setAnimalKingdomHierarchy(hierarchy)
            .build();
    assertThat(actual.synthesizedField()).isEqualTo("Spider Pig");
  }
}
Most casual Java users are surprised when they realize that you can create an ad hoc implementation of an interface in-line. It seems people are turned off from interfaces because they think that every implementation needs to be in its own class, and dealing with lots of files is another kind of annoying boilerplate.
Well, even those six lines of in-line instantiation are annoying, since only one line is actual logic. For single-method interfaces, Java makes it even simpler with lambda expressions (anonymous functions), paired with the annotation @FunctionalInterface. First, you annotate your interface, which must have exactly one abstract method. The annotation is optional, but with it the compiler rejects the interface if it ever stops meeting that requirement.
@FunctionalInterface
public interface AnimalKingdomHierarchy {
  public Order getOrder(int numberOfLegs);
}
Then, wherever an AnimalKingdomHierarchy is expected, the compiler accepts a lambda expression or method reference whose parameters and return type match getOrder, and turns it into an instance of AnimalKingdomHierarchy. The test code could be rewritten as
class AnimalTest {
  private Animal.Factory.Builder defaultBuilder() { ... }
  @Test
  public void myTest() {
    // This can be inlined in the builder, but I made it a variable here to
    // emphasize that it compiles to this type.
    AnimalKingdomHierarchy hierarchy =
        legs -> legs == 8 ? Order.create("Araneae") : Order.create("unknown");
    Animal actual =
        defaultBuilder()
            .setName("Babe")
            .setNumberOfLegs(8)
            .setAnimalKingdomHierarchy(hierarchy)
            .build();
    assertThat(actual.synthesizedField()).isEqualTo("Spider Pig");
  }
}
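For readers who want to run something, here is a self-contained sketch of the three ways to satisfy a single-method interface: an anonymous class, a lambda, and a method reference. To keep it dependency-free I've assumed getOrder returns a plain String instead of the Order type used above. HierarchyDemo and orderByLegs are made-up names.

```java
@FunctionalInterface
interface AnimalKingdomHierarchy {
  // Simplified for this sketch: returns a String rather than an Order.
  String getOrder(int numberOfLegs);
}

final class HierarchyDemo {
  /** An ordinary static method whose signature happens to match getOrder. */
  static String orderByLegs(int legs) {
    return legs == 8 ? "Araneae" : "unknown";
  }

  /** Client code that only knows about the interface. */
  static String describe(AnimalKingdomHierarchy hierarchy, int legs) {
    return legs + " legs: " + hierarchy.getOrder(legs);
  }

  public static void main(String[] args) {
    // 1. Anonymous class: works for any interface, but is mostly ceremony.
    AnimalKingdomHierarchy fromAnonymousClass =
        new AnimalKingdomHierarchy() {
          @Override
          public String getOrder(int numberOfLegs) {
            return numberOfLegs == 8 ? "Araneae" : "unknown";
          }
        };

    // 2. Lambda: the compiler infers the parameter and return types.
    AnimalKingdomHierarchy fromLambda = legs -> legs == 8 ? "Araneae" : "unknown";

    // 3. Method reference: reuses an existing method with a matching signature.
    AnimalKingdomHierarchy fromMethodRef = HierarchyDemo::orderByLegs;

    System.out.println(describe(fromAnonymousClass, 8));
    System.out.println(describe(fromLambda, 8));
    System.out.println(describe(fromMethodRef, 4));
  }
}
```

All three variables have the same static type, so describe cannot tell them apart. That is what makes swapping a real implementation for a one-line fake so cheap in tests.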
Functional interfaces also work seamlessly with dependency injection. The process of changing which instance of an interface is injected requires only changing the line that @Binds the interface type to the implementation type.
Because of these conveniences, I've embraced functional interfaces.
Suppose there is a calculation of ownership of an Animal. It starts out simple: an if statement using only other fields of Animal. A hypothetical programmer might add it as a method to Animal, and client code would call animal.owner(). Instead, I would have the client code depend on a functional interface AnimalOwnershipFn, whose method has the type signature Animal -> Owner.
Inevitably, I expect ownership to get more complex. It may start to depend on other types, like, say, an OwnershipHistory and a PoundLog. With the object-oriented style, these might be added as new fields on Animal, which at best requires plumbing and likely updating tests for Animal. At worst it requires extra computation, because what subset of an entire pound's log "belongs" on a specific animal instance? Surely, not the entire pound log...
With FunctionalInterface (which might be called the "functional style", though that term has baggage I don't mean to include here) you make a dedicated interface, pass the interface to clients, and the implementation details become less jarring when they change. Clients are forced to depend only on the bit of logic they actually need, and not the entire package of data in Animal if they don't need it. And it allows you to keep data classes as only holding data, rather than mixing in business policy.
@AutoValue
abstract class RabiesCheckImpl implements RabiesCheckFn {
  ...
  // Each dependency is a single-purpose functional interface.
  public abstract AnimalOwnershipFn ownershipFn();
  public abstract HasRabiesFn hasRabiesFn();
  public abstract VaccineSchedulerFn vaccineSchedulerFn();
  public void doStuff(Animal animal) {
    Owner owner = ownershipFn().getOwner(animal);
    if (hasRabiesFn().hasRabies(animal)) {
      AppointmentRecord record = vaccineSchedulerFn().scheduleVaccine(owner);
    }
    ...
  }
}
This is the distinction that has made my work much easier: anything that involves business logic goes into a functional interface, or, if not possible, a multi-method interface. Then clients only depend on the interfaces, even if one class implements multiple interfaces. And when implementations get reorganized, or pieces get reused, the interfaces need not change.
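Here is a small runnable sketch of the ownership example, under loud assumptions: Owner is simplified to a String, the pound log is modeled as a Map from animal name to adopter, and ShelterAnimal, OwnershipDemo, and both factory methods are hypothetical names. The point is that describeOwner, the "client", never changes when the ownership rule grows a new dependency.

```java
import java.util.Collections;
import java.util.Map;

/** Plain data holder; it carries no ownership policy. */
final class ShelterAnimal {
  private final String name;
  private final String registeredOwner; // null when nobody is registered

  ShelterAnimal(String name, String registeredOwner) {
    this.name = name;
    this.registeredOwner = registeredOwner;
  }

  String name() {
    return name;
  }

  String registeredOwner() {
    return registeredOwner;
  }
}

@FunctionalInterface
interface AnimalOwnershipFn {
  String getOwner(ShelterAnimal animal);
}

final class OwnershipDemo {
  /** First version: an if-statement over the animal's own fields. */
  static AnimalOwnershipFn fromRegistration() {
    return animal -> animal.registeredOwner() != null ? animal.registeredOwner() : "unowned";
  }

  /** Later version: consult the pound log first, then fall back to registration. */
  static AnimalOwnershipFn withPoundLog(Map<String, String> adopterByAnimalName) {
    AnimalOwnershipFn fallback = fromRegistration();
    return animal -> {
      String adopter = adopterByAnimalName.get(animal.name());
      return adopter != null ? adopter : fallback.getOwner(animal);
    };
  }

  /** Client code: depends on the interface only, never on the pound log. */
  static String describeOwner(ShelterAnimal animal, AnimalOwnershipFn ownershipFn) {
    return animal.name() + " belongs to " + ownershipFn.getOwner(animal);
  }

  public static void main(String[] args) {
    ShelterAnimal babe = new ShelterAnimal("Babe", null);
    Map<String, String> poundLog = Collections.singletonMap("Babe", "Farmer Hoggett");
    System.out.println(describeOwner(babe, fromRegistration()));
    System.out.println(describeOwner(babe, withPoundLog(poundLog)));
  }
}
```

Note that the pound log never became a field on ShelterAnimal. It lives inside the one implementation that needs it, and tests for the data class are untouched.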
Keeping business logic behind small interfaces is the sort of idea that can transfer to another language, whereas FunctionalInterface is just how Java makes it palatable.
[1] This was the name of the first project I was ever paid to work on, and I always think of the grandiose name and laugh, because it was plain, boring, accounting logic.
[2] I linked to the Guice library, which is by far the most widespread dependency injection framework for Java, and ubiquitous at Google. However, I prefer Dagger because it checks the dependency graph structure at compile time instead of runtime. I happily use both.
[3] I actually hate mocks, and I insist they are only ever used for things that cannot be reasonably faked.