
It's important to note that having high test coverage doesn't make code good. Unit tests will actually make bad code even worse because it will be even more difficult to change the underlying logic (because the tests lock all the poor implementation details into place).

Tests have nothing to do with code quality. All they do is verify that the code works. I would argue that the simpler and therefore the better your code is, the less you need to rely on tests to verify that it works. Fewer edge cases means fewer tests.

I'm a big fan of integration tests though because they lock down the code based on high level features and not based on implementation details. If you ever have to rewrite a decent portion of a system (e.g. due to changing business requirements) it is deeply satisfying if your integration tests are still passing afterwards (e.g. with only minor changes to the test logic to account for the functionality changes).



I see this opinion a lot from people who haven't seen tests and code written by people experienced with TDD. The tests should not end up that coupled to the code. The implementation structure and the test structure end up somewhat different when you refactor every time the tests are green, listen to the feedback from the tests and code, and have the skills to spot the refactoring opportunities.

Oftentimes people seem to equate unit testing with a 1:1 correspondence between test and implementation, with high coupling between the two. These sorts of tests resist refactoring rather than enabling it. With good tests you can pivot the implementation and tests independently.

Recommend https://www.youtube.com/watch?v=EZ05e7EMOLM and https://vimeo.com/83960706 on TDD


In my experience, your statement is true when writing library code or tests that don't need to mock lots of objects.

Unfortunately, unit testing becomes highly coupled when testing classes in the standard web architecture. A service class you're testing can depend on other service classes, a DAO, and potentially other web services, so you're left mocking all those other classes if you want to create a unit test instead of an integration test. Since the external dependencies have been mocked out, the unit test is highly coupled to the implementation and it's a PITA to change either the test or the implementation. I suspect that's why OP prefers integration testing, as it helps keep the test less coupled to the implementation.
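
For illustration, here's roughly the shape I mean, sketched in Python with made-up names (OrderService, the DAO and the gateway are all hypothetical):

    from unittest.mock import Mock

    # Hypothetical service with the usual web-app dependencies.
    class OrderService:
        def __init__(self, payment_gateway, order_dao, notifier):
            self.payment_gateway = payment_gateway
            self.order_dao = order_dao
            self.notifier = notifier

        def place_order(self, customer_id, amount):
            charge = self.payment_gateway.charge(customer_id, amount)
            self.order_dao.save(charge)
            self.notifier.send_receipt(customer_id)

    def test_place_order():
        payment_gateway, order_dao, notifier = Mock(), Mock(), Mock()
        service = OrderService(payment_gateway, order_dao, notifier)
        service.place_order(customer_id=1, amount=100)
        # These assertions mirror the call sequence inside place_order,
        # so almost any internal refactor forces the test to change too.
        payment_gateway.charge.assert_called_once_with(1, 100)
        order_dao.save.assert_called_once_with(payment_gateway.charge.return_value)
        notifier.send_receipt.assert_called_once_with(1)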


In my experience, if your tests require lots of mocks then that's a sign that IO is coupled too tightly to application logic. Refactoring your code so this isn't the case isn't always obvious, but it's a breath of fresh air and really cleans up the interfaces.
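
A minimal sketch of what that refactor tends to look like (illustrative names, plain Python):

    # Before: IO mixed into the logic, so tests have to mock open() or a DB.
    def total_owed_coupled(path):
        with open(path) as f:
            amounts = [float(line) for line in f]
        return sum(a for a in amounts if a > 0)

    # After: the logic is pure and takes plain data; the IO lives at the edge.
    def load_amounts(path):
        with open(path) as f:
            return [float(line) for line in f]

    def total_owed(amounts):
        return sum(a for a in amounts if a > 0)

    # The interesting behaviour is now testable with no mocks at all.
    assert total_owed([10.0, -2.5, 4.0]) == 14.0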


One problem with decoupling IO is that you still somehow need to get the data deep down into those places where it's needed by your application logic. That means you end up either:

1. Passing each individual little piece of data separately down the call stack with bloated method signatures containing laundry lists of data that seemingly have nothing to do with some of the contexts where they appear.

2. Combining pieces of data into larger state-holding types which you pass down the call stack, adding complexity to tests which now need mocks.

I think one of the toughest parts of day-to-day software engineering is dealing with this tension when you have complex modules that need to pass a lot of state around. It's easier and cleaner to pull stuff out of global state or thread contexts or IO, but that makes it harder to test. More often than I would like to admit, I ask myself whether a small change really needs an automated test, because those shiny tests that we adore so much sometimes complicate the real application code a lot.
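
To make the tension concrete, here's a tiny sketch of the two shapes with made-up names; it's only meant to frame the question, not answer it:

    from dataclasses import dataclass

    # Option 1: every piece of state travels through the signature.
    def apply_discount(price, customer_tier, region, campaign, feature_flags):
        return price * (0.9 if customer_tier == "gold" else 1.0)

    # Option 2: bundle the state into a larger type and pass that down instead.
    @dataclass(frozen=True)
    class PricingContext:
        customer_tier: str
        region: str
        campaign: str

    def apply_discount_v2(price, ctx: PricingContext):
        return price * (0.9 if ctx.customer_tier == "gold" else 1.0)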

If anyone has thoughts on how they approach this problem (which don't contain the words "dynamic scoping" :P) I'd love to read them.


This is my experience as well. I learned the lesson the one time I was allowed to write unit tests at work. It was on an existing code base without tests. I had to significantly refactor code to make it testable, and one of the lessons I learned from the experience is to isolate I/O from the main business logic that I'm testing.

In the pre-test code, the functions were littered with PrintConsole statements that would take a string and a warning level (the Console was an object that was responsible for printing strings on a HW console). I made sure my main business logic was never aware of the Console object. I made an intermediate/interface class that handled all I/O, and mocked that class. Instead, the function now had LogMessage, LogWarning, LogError functions of the interface class that took a string. The function had no idea where these messages could go - it could go to the console, it could be logged to a file, it could be sent as a text message. It didn't care.

Now when we needed to make changes to how things were printed, none of our business logic functions, nor their tests, were impacted. In this case at least, attempting to unit test led to less coupled code.
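
A rough sketch of that shape, in Python rather than the original language and with invented example logic:

    class MessageSink:
        # The business logic only ever sees this interface, never the Console.
        def log_message(self, text): ...
        def log_warning(self, text): ...
        def log_error(self, text): ...

    class ConsoleSink(MessageSink):
        # One implementation forwards to the HW console...
        def __init__(self, console):
            self.console = console
        def log_warning(self, text):
            self.console.print_console(text, level="WARNING")
        # ...log_message and log_error delegate the same way.

    def check_voltage(reading, sink: MessageSink):
        # Business logic has no idea whether messages go to a console,
        # a file, or a text message, which is exactly the point.
        if reading > 5.0:
            sink.log_warning("voltage out of range")
            return False
        return True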


What if most applications are mostly IO and have little application logic? Business applications are fancy looking CRUD a lot of the time.


That’s a good insight. It applies to side effects in general, for instance setState in react.


And usually with good tdd acceptance in your team people automatically write more testable code, because they're too lazy to write tightly coupled code that needs many mocks.


... and no doubt the ratio of application/domain/pure logic to external services interaction varies tremendously by project and by industry, which is likely what leads to such a variety of opinions on the subject.


I would consider needing to mock a lot of objects to write your test a form of design feedback: an indication that our design could be improved. Perhaps the code under test has too many responsibilities, we're missing an abstraction, the boundaries are in the wrong place, or there are too many side effects.

One of the downsides of modern mocking frameworks being so easy to use is that it's less obvious when we're doing too much of it.

If we test drive the behaviour, our first failing test of a single behaviour won't involve many collaborators. If it does, we're probably trying to test more than one thing at once. At some point as we add tests we may add more collaborators. If we're refactoring at each step, we should be asking ourselves what's going wrong.

Testing more than one class at the same time doesn't make it an integration test. Arbitrarily restricting a unit to map to a single method or a class is a good way to ensure that your test code is tightly coupled to the implementation.
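
For example, a test like the following still reads as a unit test even though several (illustrative) classes collaborate behind one public entry point:

    # Illustrative names; the point is the test only knows about Checkout.
    class Money:
        def __init__(self, cents):
            self.cents = cents

    class DiscountPolicy:
        def discounted(self, money):
            return Money(int(money.cents * 0.9))

    class Checkout:
        def __init__(self):
            self.policy = DiscountPolicy()

        def total(self, cents):
            return self.policy.discounted(Money(cents)).cents

    def test_checkout_applies_discount():
        # Several classes are exercised, no mocks, and the test survives
        # moving the discount logic around behind Checkout's interface.
        assert Checkout().total(1000) == 900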


> Testing more than one class at the same time doesn't make it an integration test. Arbitrarily restricting a unit to map to a single method or a class is a good way to ensure that your test code is tightly coupled to the implementation.

But at least if you restrict your units to a single method, you have a chance of getting somewhat complete tests. If you're testing multiple classes with several methods each as a unit, the number of possible code paths is so huge that you know you cannot possibly test more than a small part of the possibilities.


This doesn't have to be the case.

If you TDD your implementation then it's all covered by tests. If you refactor as part of the TDD process then you may factor out other classes and methods from the implementation. These are still covered by the same tests but don't have their own microtests.


If you cannot write a simple test for your code, it is a good indication that you need to change the code, not the test.


The video seems to support all my points. "Adding a new class is not the trigger for writing tests. The trigger is implementing a requirement."

A test which covers a class is a unit test. A requirement is typically a feature. To test a feature, you usually need integration tests because a feature usually involves multiple classes.


>Tests have nothing to do with code quality.

I didn't downvote your comment but I vehemently disagree. Mission-critical code such as NASA flight guidance, avionics, and low-level libraries like SQLite depend on a suite of tests to maintain software quality. (I wrote a previous comment on this.[0])

We also want the new software that commands self-driving cars to have thousands of tests that cover as many scenarios as possible. I don't have inside knowledge of either Waymo or Tesla but it seems like common sense to assume those software programmers rely on a massive suite of unit tests to stress test their cars' decision algorithms. One can't write software with that level of complexity and life-and-death consequences without relying on numerous tests at all layers of the stack. Yes, the cars will still have bugs and will sometimes make the wrong decision, but their software would be worse without the tests.

High quality software relies on both lower-level unit tests and higher-level integration tests. Or put another way, both "black box" and "white box" testing strategies are used.

[0] https://news.ycombinator.com/item?id=15592392


Isn't this disagreement basically the same point made by Martin about different kinds of quality? SQLite's tests don't say the code is architected well and reusable and modular and blah blah blah; they say that it works. When people talk about the quality of NASA code or SQLite, that feels more like external quality than internal quality.


The 100% MC/DC testing in SQLite does not force the code to be well-architected, but it does help us to improve the architecture.

(1) The 100% branch test coverage requirement forces us to remove unreachable code, or else convert that code into assert() statements, thereby helping to remove cruft.

(2) High test coverage gives us freedom to refactor the code aggressively without fear of breaking things.

So, if your developers are passionate about long term maintainability, then having 100% MC/DC testing is a big big win. But if your developers are not interested in maintainability, then forcing a 100% MC/DC requirement on them does not help and could make things worse.
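
As an illustration of point (1) (not taken from the SQLite sources, and in Python rather than C): coverage flags the defensive branch that callers can never reach, and it becomes an assertion instead.

    # Before: the "impossible" branch shows up as forever-untested code.
    def lookup_before(index, table):
        if index < 0:          # callers already guarantee index >= 0
            return None
        return table[index]

    # After: the invariant is stated and checked, and every remaining
    # branch can actually be exercised by the test suite.
    def lookup_after(index, table):
        assert index >= 0
        return table[index]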


>Isn't this disagreement basically the same point made by Martin about different kinds of quality?

M Fowler's comment about "tests" was also made in the context of internal quality. He mentions "cruft" as the buildup of bad internal code that the customer can't see:

>[...] the best teams both create much less cruft but also remove enough of the cruft they do create that they can continue to add features quickly. They spend time creating _automated tests_ so that they can surface problems quickly and spend less time removing bugs.


I think what he means is that just because you have tests (and even if you have high code coverage) doesn't mean that your code is high quality. They're correlated, but I've actually seen code with high test coverage... whose tests never made any assertions.
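
Something like this, which exercises every line yet proves nothing (made-up example):

    def parse_price(text):
        return round(float(text.strip().lstrip("$")), 2)

    def test_parse_price():
        parse_price("$10.505")   # runs every line of parse_price...
        parse_price(" 3 ")       # ...but never asserts on a single result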


>tests [...] doesn't mean that your code is high quality. They're correlated,

Yes, if they're correlated, that contradicts the absolutist statement of "Tests have nothing to do with code quality."

Trying to improve code correctness directly affects code quality.


So self-driving systems are based on machine learning, and thus do not have regular (deterministic) unit tests. They will mainly be tested on past data, but the end results are always probabilistic. I.e. no one (not even Musk himself) knows how the car would behave when it sees something it was not trained on.


I've always wondered whether there's a bunch of conditional statements constraining the output of the probabilistic model. That would seem like the logical thing to do, but I'm not familiar enough with ML to know whether such a thing is needed or not.


I think that unit tests make sense for safety-critical systems but still in those cases, my point would be that it's better to add them near the end of the project once the code has settled.


Skilled carpenters use hammers. That doesn't mean a hammer can't cause a lot of damage if used incorrectly.


Re: your last point, I recently rewrote parts of a billing system full of hairy logic and edge cases (and bugs). The initial MVP consisted of exactly replicating the existing invoicing logic. Due to the general complexity of the problem domain, I found myself rethinking and rewriting large portions of the system as I grew more familiar with the (undocumented, naturally) business requirements. In some cases I'd throw out entire modules and associated unit tests. After a while, I started relying more on integration tests which simply compared generated invoices between the two systems (and/or against golden files.)

Having these made it extremely easy to refactor large portions of the system quickly without needing to refactor unit tests. (I still wrote unit tests, just less of them, more focused on the stabler parts of the system.) This has loosened the grip of the "every function must have a unit test" mantra in my mind, which... I dunno, somewhere along the way sort of became simply assumed.
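
The golden-file comparison itself can stay very small. A sketch of the general pattern (invented paths and names, not the actual billing code):

    import csv
    import pathlib

    def generate_invoice_csv(billing_period):
        # Stand-in for the real invoicing pipeline.
        return "customer,total\nacme,100.00\n"

    def test_invoice_matches_golden_file():
        golden = pathlib.Path("testdata/invoice_2023_01.csv").read_text()
        actual = generate_invoice_csv("2023-01")
        # Compare parsed rows rather than raw bytes so formatting-only
        # changes don't cause spurious failures.
        assert list(csv.reader(actual.splitlines())) == \
               list(csv.reader(golden.splitlines()))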

Some caveats to note, however. A) The code had minimal external dependencies (postgres). B) The integration tests ran very quickly against a local postgres database, only slightly slower than unit tests might, providing a quick dev feedback loop. C) While internally, the system was rather complex, the output was not. It was a simple CSV file that's trivial to parse/compare.

Thus, I wouldn't overgeneralize from the above. In cases where there are lots of external dependencies, integration tests are slow, or where evaluating the test results is more tricky (ie, you need Selenium or whatnot), this approach wouldn't be as feasible.


Most of this is a series of false choices. In fact:

- tests can help show code quality improvements do not break anything

- you can have integration tests and unit tests at the same time; in fact, it is more of a spectrum than two rigid categories

- it's often possible to have simple code and test it

Generally speaking the more specific the question, the less controversial the choices are. It's typically not all that interesting to argue about how to test a particular algorithm, data structure, or service.

The hard part in all of this, from an engineering perspective, is just talking to folks, promoting good teamwork, actually showing the value of less obvious things (a passing test suite), and knowing what to do when technology choices become toxic to the product or team.


Most interesting refactorings change the boundaries and count of abstractions, which usually does break unit tests.

Unit tests are great at the leaves of the call graph, and things which are almost leaves because their dependencies aren't at any real risk of change. The further into the stack you go, the more brittle they get.


The point is that I use all kinds of tests all the time. It works fine. You aren't required to join a tribe and debate about abstract test styles.

Look at the current problem and come up with good answers to the questions:

- How do we know it works?

- How will we know it still works in a year?

...you don't always need the best answers, even. Most projects should start with honest answers and work from there.


> It's important to note that having high test coverage doesn't make code good.

Sure, but low test coverage doesn't make it good either. Coverage is a metric and like any metric, it (1) needs to be assessed with judgement and (2) becomes useless when it's used as a target rather than a measurement.

> Tests have nothing to do with code quality. All they do is verify that the code works.

Well to start, Fowler notes a distinction between external and internal quality. External quality is "does it work from the end user perspective?", which can be verified by tests -- you note integration tests in this role (acceptance tests, feature tests, user tests, behaviour tests, whatever you choose to call them). In the external quality case, verifying that the code works is a large fraction of quality.

Your argument, I think, is that internal quality is unaffected by testing. I don't agree: in my experience the needs of simple testing create constant design pressure on the production code, most of which makes it easier to create future changes.

Though as noted at the top of the thread: expertise still matters. Writing better tests and better production code are skills.


> I'm a big fan of integration tests though because they lock down the code based on high level features and not based on implementation details.

I've found this to be a dangerous mindset. Integration tests are great, but they need a solid foundation of unit tests. Integration tests are slow, difficult to root-cause, complex to write and maintain, and also generally don't test all the various corners of the system.

Testing is a pyramid, with unit tests at the bottom and integration tests somewhere in the middle. If your unit tests are based on implementation details, as you say, then that's probably a sign that a refactoring is in order (I'd love to be less blunt, but it's tough in the absence of more details).


> Tests have nothing to do with code quality. All they do is verify that the code works. I would argue that the simpler and therefore the better your code is, the less you need to rely on tests to verify that it works. Fewer edge cases means fewer tests.

While I won't dispute that tests verify that the code works, the assertion that tests therefore have nothing to do with code quality is incorrect, and here's why.

Some of the main types of poorly written code are 1) brittle code, which breaks easily when things are changed, such as dependency changes or changes in I/O, and 2) unreadable code, which decreases accurate understanding of what the code does and causes incorrect assumptions to be made, which yields bugs.

Unit tests, over time, raise the alarm about these types of code smells. While a test might not yield much info for a short time after it is written, when the code ages and has to stand up to the changing code and environment around it, well-written tests WILL highlight parts of the code that can be considered poorly written by the two criteria above.


> Unit tests will actually make bad code even worse because it will be even more difficult to change the underlying logic (because the tests lock all the poor implementation details into place).

This statement is patently false, unless for some reason a project includes unit tests themselves as the production code, which would be highly unusual.

At most, unit tests must be refactored along with the code, but that's the standard operating procedure.


This seems to assume that the tests will be higher quality than the problematic code in the first place. It’s actually commonplace to see tests coupled to internal implementation details of production code, which makes refactoring very hard.

The idea of TDD (mostly lost to hype and consultants) is that you change _EITHER_ the tests or the code in each operation. This allows you to use one as a control against the other. If you change both, you prove that different tests pass against different code, which is substantially less useful. Unfortunately if tests are coupled to internal state, getting code to even compile without modifying both sides of the production/test boundary is difficult after a refactor.


> This seems to assume that the tests will be higher quality than the problematic code in the first place.

If the problem lies with problematic code then the tests are not the problem. At most they're just yet another place where the problematic code causes pain.

Let's put it this way: would the problems go away if the tests were ripped out?


I actually just finished a ticket related to this. It took me significantly longer than necessary because I also had to go through all the poorly written tests.


But while you're building a new system/subsystem, it doesn't make sense to write unit tests for units of code which have a high likelihood of being deleted 1 month from now due to evolving project requirements.

It's like if you were building a smartphone; it wouldn't make sense to screw all the internal components into place one by one unless you were sure that all the components would end up fitting perfectly together inside the case. While building the prototype, you may decide to move some components around, downsize some components, trim some wires and remove some low-priority components to make room for others. In this case, unit tests are like screws.


The problem is that prototypes end up being production code in the real world. Writing tests is about managing risk. You should write some basal level of unit tests as you go to validate your logic. That basal level is determined by the team's or individual's tolerance for risk.


Who said anything about production? Haven't you seen requirements change even before the first prototype is ready? I had a meeting literally today where the client's CFO and head accountant threw out a week of my team's work because they forgot about key requirements (and this has happened for the third time this year).


> Unit tests will actually make bad code even worse because it will be even more difficult to change the underlying logic

Objectively false: if not having tests is better than having tests, then delete the tests. Instant improvement.

This fact leads to the conclusion that the value of having tests is greater than or equal to the value of not having tests, in all cases.


Yes, I have instantly improved code by removing micro-level solipsistic tests that were tightly coupled to the implementation. These tests made it much slower to improve the quality of the code and had zero benefits, because they only tested that the code did what it did, not what it was supposed to do.


You are missing the effect of loss-aversion and the cognitive bias towards keeping a test even if it adds negative value.

Once a test has been added, it will tend to stick around even if it is worse than no test.


Good point, and I agree. Sunk cost fallacy and all that. I wouldn't consider those "objective", but I don't think it's worth arguing over the definition of that word when I think we otherwise agree.

I also would agree that sometimes time has been wasted creating too many tests. Perhaps that time could have been spent to greater effect.

I also think that even if, in retrospect, a test is very tightly coupled and specific to one implementation, that test still might have revealed bugs and may have helped the original author. If that test is now a burden, throw it away.


You appear to be attempting some sort of reductio ad absurdum, but in many cases work on a change to the software starts with deleting all the related tests, because they're going to be irrelevant and changing them isn't worth the extra work.

The fact that this deletion is necessary means the tests apparently did make the code a bit worse.

Also, all the time they were in place while the code wasn't being changed, they made running the application's test suite slower.



