Characteristics of a Great Scrum Team

According to the Scrum Guide, Scrum is a framework within which people can address complex problems, and productively and creatively develop products of the highest possible value. It’s a tool organizations can use to increase their agility.

Within Scrum, self-organizing, cross-functional, and highly productive teams do the work: creating valuable, releasable product increments. Scrum offers a framework that catalyzes the team’s learning through discovery, collaboration and experimentation.

A great Scrum Team consists of a Product Owner who maximizes value, a Scrum Master who enables continuous improvement and a Development Team that focuses on delivering high-quality product increments.

For sure this sounds great!

But what are the characteristics of such a great Scrum team? This white paper will answer that question. It offers a detailed description of the characteristics and skills of a great Product Owner, Scrum Master and Development Team.

The Product Owner

The Product Owner is responsible for maximizing the value of the product and the work of the Development Team. It’s a one-person role that brings the customer perspective of the product to a Scrum Team.

The Product Owner is responsible for:

  • Developing and maintaining a product vision and market strategy;
  • Product management;
  • Ordering and managing the Product Backlog;
  • Involving stakeholders and end-users in Product Backlog refinement and backlog management;
  • Alignment with other Product Owners when needed from an overall product, company or customer perspective.

A Great Product Owner…

  • Embraces, shares and socializes the product vision. A great Product Owner represents the customer’s voice and creates a product vision together with the stakeholders. Every decision is taken with the product vision in mind. This ensures sustainable product development, provides clarity for the development team and drastically increases the chances of product success.
  • Exceeds the customer’s expectation. A great Product Owner truly understands the customer’s intentions and goals with the product and is able to exceed the customer’s expectations. Customer delight is the ultimate goal!
  • Is empowered. A great Product Owner is empowered to take decisions related to the product. Sure, creating support for his decisions might take some time, but swiftly taking important decisions is a primary condition for a sustainable pace of the development team.
  • Orders the product backlog. A great Product Owner understands that the product backlog should be ordered. Priority, risk, value, learning opportunities and dependencies are all taken into account and balanced with each other. For example, when building a house the roof might have the highest priority considering possible rain. But it’s still necessary to build the foundation and walls first, and therefore to order them above the construction of the roof.
  • Prefers face-to-face communication. A great Product Owner understands that the best way to convey information is face-to-face communication. User stories are explained in a personal conversation. If a tool is used for backlog management, its function is to support the dialogue. It never replaces the good old-fashioned conversation.
  • Knows modeling techniques. A great Product Owner has a backpack full of valuable modeling techniques. He knows when to apply a specific model. Examples are Business Model Generation, Lean Startup or Impact Mapping. Based on these models he knows how to drive product success.
  • Shares experiences. A great Product Owner shares experiences with peers. This might be within the organization, and outside it: seminars and conferences are a great way to share experiences and gather knowledge. In addition, writing down your lessons learned can be valuable for other Product Owners.
  • Owns user story mapping. A great Product Owner should master the concept of user story mapping. It’s a technique that allows you to add a second dimension to your backlog. The visualization enables you to see the big picture of the product backlog. Jeff Patton wrote some excellent material about the concept of story mapping.
  • Has a focus on functionality. A great Product Owner has a focus on functionality and the non-functional aspects of the product. Hours or even story points are less important. The goal of the Product Owner is to maximize value for the customer. It’s the functionality that has value; therefore this is the main focus for the Product Owner.
  • Is knowledgeable. A great Product Owner has in-depth (non-)functional product knowledge and understands the technical composition. For large products it might be difficult to understand all the details, and scaling the Product Owner role might be an option. However, the Product Owner should always know the larger pieces of the puzzle and thereby make conscious, solid decisions.
  • Understands the business domain. A great Product Owner understands the domain and environment he’s part of. A product should always be built with its context taken into account. This includes understanding the organization paying for the development, but also being aware of the latest market conditions. Shipping an awesome product after the window of opportunity closes is quite useless.
  • Acts on different levels. A great Product Owner knows how to act on different levels. The most common way to define these levels is strategic, tactical and operational. A Product Owner should know how to explain the product strategy at board level, create support at middle management and motivate the development team with their daily challenges.
  • Knows the 5 levels of Agile planning. Within Agile, planning is done continuously. Every product needs a vision (level 1) which will provide input to the product roadmap (level 2). The roadmap is a long-range strategic plan of how the business would like to see the product evolve. Based on the roadmap, market conditions and status of the product, the Product Owner can plan releases (level 3). During the Sprint Planning (level 4) the team plans and agrees on the Product Backlog Items they are confident they can complete during the Sprint and that help them achieve the Sprint Goal. The Daily Scrum (level 5) is used to inspect and adapt the team’s progress towards realizing the Sprint Goal.
  • Is available. A great Product Owner is available to the stakeholders, the customers, the development team and the Scrum Master. Important questions are answered quickly and valuable information is provided on time. The Product Owner ensures that a lack of availability never blocks the progress of the development team.
  • Is able to say ‘no’. A great Product Owner knows how and when to say no. This is probably the most obvious but most difficult characteristic to master. Saying yes to a new idea or feature is easy; it’s just another item for the product backlog. However, good backlog management encompasses creating a manageable product backlog with items that will probably get realized. Adding items to the backlog knowing nothing will happen with them only creates ‘waste’ and false expectations.
  • Acts as a “Mini-CEO”. A great Product Owner basically is a mini-CEO for his product. He has a keen eye for opportunities, focuses on business value and the Return On Investment, and acts proactively on possible risks and threats. All of this is done with the growth (size, quality, market share) of his product in mind.
  • Knows the different types of valid Product Backlog items. A great Product Owner can clarify the fact that the Product Backlog consists of more than only new features. For example, technical innovation, bugs, defects, non-functional requirements and experiments should also be taken into account.
  • Takes Backlog Refinement seriously. A great Product Owner spends enough time refining the Product Backlog. Backlog Refinement is the act of adding detail, estimates and order to items in the Product Backlog. The outcome should be a Product Backlog that is granular enough and well understood by the whole team. Refinement usually consumes no more than 10% of the Development Team’s capacity. The way it is done isn’t prescribed and is up to the team. The Product Owner can involve stakeholders and the Development Team in backlog refinement: the stakeholders because it gives them the opportunity to explain their wishes and desires, and the Development Team because they can clarify functional and technical questions or implications. This ensures common understanding and considerably increases the quality of the Product Backlog. As a consequence, the opportunity to build the right product with the desired quality will also increase.
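The ordering trade-off from the house example above (priority balanced against dependencies) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a prescribed Scrum technique: the function name order_backlog, the numeric priorities and the extra ‘garden’ item are all hypothetical. The sketch repeatedly picks the highest-priority item whose prerequisites have already been placed.

```python
import heapq


def order_backlog(priorities, depends_on):
    """Order items by priority without placing any item before its prerequisites.

    priorities: dict mapping item -> number (higher means more important).
    depends_on: dict mapping item -> set of items that must come first.
    Assumes every prerequisite is itself a key in `priorities`.
    """
    remaining = {item: set(depends_on.get(item, ())) for item in priorities}
    dependents = {item: [] for item in priorities}
    for item, prereqs in remaining.items():
        for prereq in prereqs:
            dependents[prereq].append(item)

    # Max-heap via negated priority; the item name breaks ties deterministically.
    ready = [(-priorities[i], i) for i, prereqs in remaining.items() if not prereqs]
    heapq.heapify(ready)

    ordered = []
    while ready:
        _, item = heapq.heappop(ready)
        ordered.append(item)
        # Anything that was only waiting for this item becomes available.
        for dependent in dependents[item]:
            remaining[dependent].discard(item)
            if not remaining[dependent]:
                heapq.heappush(ready, (-priorities[dependent], dependent))

    if len(ordered) != len(priorities):
        raise ValueError("circular dependency in backlog")
    return ordered


# Hypothetical numbers: the roof has the highest priority, but it needs walls,
# and walls need a foundation.
house = order_backlog(
    priorities={"roof": 9, "garden": 2, "walls": 5, "foundation": 4},
    depends_on={"roof": {"walls"}, "walls": {"foundation"}},
)
print(house)
```

Even though the roof carries the highest priority, the foundation and walls end up above it, while the low-priority garden, which blocks nothing, sinks to the bottom. A real backlog would also weigh risk, value and learning opportunities, which a single number cannot capture.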

The Scrum Master

According to the Scrum Guide the Scrum Master is responsible for ensuring Scrum is understood and enacted. Scrum Masters do this by ensuring that the Scrum Team adheres to Scrum theory, practices, and rules. The Scrum Master is a servant-leader for the Scrum Team. The Scrum Master helps those outside the Scrum Team understand which of their interactions with the Scrum Team are helpful and which aren’t. The Scrum Master helps everyone change these interactions to maximize the value created by the Scrum Team.

The role of a Scrum Master is one of many stances and diversity. A great Scrum Master is aware of them and knows when and how to apply them, depending on situation and context. Everything with the purpose of helping people understand and apply the Scrum framework better.

The Scrum Master acts as a:

  • Servant Leader whose focus is on the needs of the team members and those they serve (the customer), with the goal of achieving results in line with the organization’s values, principles, and business objectives;
  • Facilitator by setting the stage and providing clear boundaries in which the team can collaborate;
  • Coach coaching the individual with a focus on mindset and behaviour, the team in continuous improvement and the organization in truly collaborating with the Scrum team;
  • Conflict navigator to address unproductive attitudes and dysfunctional behaviors;
  • Manager responsible for managing impediments, eliminating waste, managing the process, managing the team’s health, managing the boundaries of self-organization, and managing the culture;
  • Mentor that transfers agile knowledge and experience to the team;
  • Teacher to ensure Scrum and other relevant methods are understood and enacted.

A Great Scrum Master…

  • Involves the team in setting up the process. A great Scrum Master ensures the entire team supports the chosen Scrum process and understands the value of every event. The daily Scrum, for example, is planned at a time that suits all team members. A common concern about Scrum is the number of ‘meetings’. Involving the team in planning the events and discussing the desired outcome will certainly increase engagement.
  • Understands team development. A great Scrum Master is aware of the different phases a team will go through when working as a team. He understands Tuckman’s different stages of team development: forming, storming, norming, performing and adjourning. The importance of a stable team composition is therefore also clear.
  • Understands principles are more important than practices. Without a solid, supported understanding of the agile principles, every implemented practice is basically useless. It’s an empty shell. An in-depth understanding of the agile principles by everyone involved will increase the chances of successful usage of practices drastically.
  • Recognizes and acts on team conflict. A great Scrum Master recognizes team conflict in an early stage and can apply different activities to resolve it. A great Scrum Master understands conflict isn’t necessarily wrong. Healthy conflict and constructive disagreement can be used to build an even stronger team.
  • Dares to be disruptive. A great Scrum Master understands some changes will only occur by being disruptive. He knows when it’s necessary and is prepared to be disruptive enough to enforce a change within the organization.
  • Is aware of the smell of the place. A great Scrum Master can have an impact on the culture of the organization so that the Scrum teams can really flourish. He understands that changing people’s behavior isn’t about changing people, but changing the context which they are in: the smell of the place.
  • Is both dispensable and wanted. A great Scrum Master has supported the growth of teams in such a manner that they don’t need him anymore on a daily basis. But due to his proven contribution he will get asked for advice frequently. His role has changed from a daily coach and teacher to a periodical mentor and advisor.
  • Lets his team fail (occasionally). A great Scrum Master knows when to prevent the team from failing but also understands when he shouldn’t prevent it. The lessons learned after a mistake might be more valuable than some good advice beforehand.
  • Encourages ownership. A great Scrum Master encourages and coaches the team to take ownership of their process, task wall and environment.
  • Has faith in self-organization. A great Scrum Master understands the power of a self-organizing team. “Bring it to the team” is his daily motto. Attributes of self-organizing teams are that employees reduce their dependency on management and increase ownership of the work. Some examples are: they make their own decisions about their work, estimate their own work, have a strong willingness to cooperate and team members feel they are coming together to achieve a common purpose through release goals, sprint goals and team goals.
  • Values rhythm. A great Scrum Master understands the value of a steady sprint rhythm and does everything to create and maintain it. The sprint rhythm should become the team’s heartbeat, which doesn’t cost any energy. Everyone knows the date, time and purpose of every Scrum event. They know what is expected and how to prepare. Therefore a complete focus on the content is possible.
  • Knows the power of silence. A great Scrum Master knows how to truly listen and is comfortable with silence. Not talking, but listening. He is aware of the three levels of listening – level 1 internal listening, level 2 focused listening, level 3 global listening, and knows how to use them. He listens carefully to what is said, but also to what isn’t said.
  • Observes. A great Scrum Master observes his team with their daily activities. He doesn’t have an active role within every session. The daily Scrum, for example, is held by the team for the team. He observes the session and thereby has a clearer view of what is being discussed (and what isn’t) and what everyone’s role is during the standup.
  • Shares experiences. A great Scrum Master shares experiences with peers. This might be within the organization, but also seminars and conferences are a great way to share experiences and gather knowledge. Of course writing down and sharing your lessons learned is also highly appreciated. And yes, for the attentive readers, this is exactly the same as for the Product Owner and the Development Team.
  • Has a backpack full of different retrospective formats. A great Scrum Master can apply lots of different retrospective formats. This ensures the retrospective will be a fun and useful event for the team. He knows what format is most suitable given the team’s situation. Even better: he supports the team in hosting their own retrospective. To improve involvement this is an absolute winner!
  • Can coach professionally. A great Scrum Master understands the power of professional coaching and has mastered this area of study. Books like Coaching Agile Teams and Co-Active Coaching don’t have any secrets for him. He knows how to guide without prescribing. He can close the gap between thinking about doing and actually doing; he can help the team members understand themselves better so they can find new ways to make the most of their potential. Yes, these last few sentences are actually an aggregation of several coaching definitions, but it sounds quite cool!
  • Has influence at organizational level. A great Scrum Master knows how to motivate and influence at the tactical and strategic levels. Some of the most difficult impediments a team will face occur at these levels; therefore it’s important a Scrum Master knows how to act at the different levels within an organization.
  • Prevents impediments. A great Scrum Master not only resolves impediments, he prevents them. Due to his experience he is able to ‘read’ situations and thereby act on them proactively.
  • Isn’t noticed. A great Scrum Master isn’t always actively present. He doesn’t disturb the team unnecessarily and supports the team in getting into the desired ‘flow’. But when the team needs him, he’s always available.
  • Forms a great duo with the Product Owner. A great Scrum Master has an outstanding partnership with the Product Owner. Their interests are somewhat different: the Product Owner ‘pushes’ the team, while the Scrum Master protects it. Even so, a solid partnership is extremely valuable for the Development Team. Together they can build the foundation for astonishing results.
  • Allows leadership to thrive. A great Scrum Master allows leadership within the team to thrive and sees this as a successful outcome of their coaching style. They believe in the motto “leadership isn’t just a title, it’s an attitude”. And it’s an attitude everyone in the team can apply.
  • Is familiar with gamification. A great Scrum Master is able to use the concepts of game thinking and game mechanics to engage users in solving problems and increase users’ contribution.
  • Understands there’s more than just Scrum. A great Scrum Master is also competent with XP, Kanban and Lean. He knows the strengths, weaknesses, opportunities and risks of every method/framework/principle and how & when to use them. He tries to understand what a team wants to achieve and helps them become more effective in an agile context.
  • Leads by example. A great Scrum Master is someone that team members want to follow. He does this by inspiring them to unleash their inner potential and showing them the desired behavior. At difficult times, he shows them how to act on it; he doesn’t panic, stays calm and helps the team find the solution. Therefore a great Scrum Master should have some resemblance to Gandalf. The beard might be a good starting point 🙂
  • Is a born facilitator. A great Scrum Master has facilitation as his second nature. All the Scrum events are a joy to attend, and every other meeting is well prepared, useful and fun, and has a clear outcome and purpose.

The Development Team

According to the Scrum Guide the Development Team consists of professionals who do the work of delivering a potentially releasable Increment of “Done” product at the end of each Sprint. Only members of the Development Team create the Increment. Development Teams are structured and empowered by the organization to organize and manage their own work. The resulting synergy optimizes the Development Team’s overall efficiency and effectiveness.

Development Teams have the following characteristics:

  • Self-organizing. They decide how to turn Product Backlog Items into working solutions.
  • Cross-functional. As a whole, they’ve got all the skills necessary to create the product Increment.
  • No titles. Everyone is a Developer, no one has a special title.
  • No sub-teams in the Development team.
  • Committed to achieving the Sprint Goal and delivering a high quality increment.

A Great Development Team…

  • Pursues technical excellence. Great Development Teams use Extreme Programming as a source of inspiration. XP provides practices and rules that revolve around planning, designing, coding and testing. Examples are refactoring (continuously streamlining the code), pair programming, continuous integration (programmers merge their code into a code baseline whenever they have a clean build that has passed the unit tests), unit testing (testing code at development level) and acceptance testing (establishing specific acceptance tests).
  • Applies team swarming. Great Development Teams master the concept of ‘team swarming’. This is a method of working where a team works on just a few items at a time, preferably even one item at a time. Each item is finished as quickly as possible by having many people work on it together, rather than having a series of handoffs.
  • Uses spike solutions. A spike is a concise, timeboxed activity used to discover work needed to accomplish a large ambiguous task. Great Development Teams use spike experiments to solve challenging technical, architectural or design problems.
  • Refines the product backlog as a team. Great Development Teams consider backlog refinement a team effort. They understand that the quality of the Product Backlog is the foundation for a sustainable development pace and building great products. Although the Product Owner is responsible for the product backlog, it’s up to the entire team to refine it.
  • Respects the Boy Scout Rule. Great Development Teams use the Boy Scout Rule: always leave the campground cleaner than you found it. Translated to software development: always leave the code base in a better state than you’ve found it. If you find messy code, clean it up, regardless of who might have made the mess.
  • Criticizes ideas, not people. Great Development Teams criticize ideas, not people. Period.
  • Share experiences. Great Development Teams share experiences with peers. This might be within the organization, but also seminars and conferences are a great way to share experiences and gather knowledge. Of course writing down and sharing your lessons learned is also highly appreciated. And yes, for the attentive readers, this is exactly the same as for the Product Owner.
  • Understands the importance of having some slack. Great Development Teams have some slack within their sprint. Human beings can’t be productive all day long. They need time to relax, have a chat at the coffee machine or play table football. They need some slack to be innovative and creative. They need time to have some fun. By doing so, they ensure high motivation and maximum productivity. But slack is also necessary to handle emergencies that might arise; you don’t want your entire sprint to get into trouble when you need to create a hot-fix. Therefore: build in some slack! And when the sprint doesn’t contain any emergencies: great! This gives the team the opportunity for some refactoring and emergent design. It’s a win-win!
  • Has fun with each other. Great Development Teams ensure a healthy dose of fun is present every day. Fostering fun, energy, interaction and collaboration creates an atmosphere in which the team will flourish!
  • Don’t have any Scrum ‘meetings’. Great Development Teams consider the Scrum events as opportunities for conversations. Tobias Mayer describes this perfectly in his book ‘The People’s Scrum’: “Scrum is centered on people, and people have conversations. There are conversations to plan, align, and to reflect. We have these conversations at the appropriate times, and for the appropriate durations in order to inform our work. If we don’t have these conversations, we won’t know what we are doing (planning), we won’t know where we are going (alignment), and we’ll keep repeating the same mistakes (reflection).”
  • Knows their customer. Great Development Teams know their real customer. They are in direct contact with them. They truly understand what they desire and are therefore able to make the right (technical) decisions.
  • Can explain the (business) value of non-functional requirements. Great Development Teams understand the importance of non-functional requirements such as performance, security and scalability. They can explain the (business) value to their Product Owner and customer and thereby ensure these requirements are part of the product backlog.
  • Trust each other. Great Development Teams trust each other. Yes, this is obvious. But without trust it’s impossible for a team to achieve greatness.
  • Keep the retrospective fun. Great Development Teams think of retrospective formats themselves. They support the Scrum Master with creative, fun and useful formats and offer to facilitate the sessions themselves.
  • Deliver features during the sprint. Great Development Teams deliver features continuously. Basically they don’t need sprints anymore. Feedback is gathered and processed whenever an item is ‘done’; this creates a flow of continuous delivery.
  • Don’t need a sprint 0. Great Development Teams don’t need a sprint 0 before the ‘real’ sprints start. They are able to deliver business value in the first sprint.
  • Acts truly cross-functional. Great Development Teams not only have a cross-functional composition but also act truly cross-functionally. They don’t talk about different roles within the team but are focused on delivering a releasable product each sprint as a team. Everyone is doing the stuff that’s necessary to achieve the sprint goal.
  • Updates the Scrum board themselves. Great Development Teams ensure the Scrum/team board is always up-to-date. It’s an accurate reflection of the reality. They don’t need a Scrum Master to encourage them; instead they collaborate with the Scrum Master to update the board.
  • Spends time on innovation. Great Development Teams understand the importance of technical/architectural innovation. They know it’s necessary to keep up with the rapidly changing environment and technology. They ensure they have time for innovation during regular working hours, and that it’s fun and exciting!
  • Don’t need a Definition of Done. Great Development Teams deeply understand what ‘done’ means for them. For the team members, writing down the Definition of Done isn’t necessary anymore. They know. The only reason to use it is to make the ‘done state’ transparent for their stakeholders.
  • Knows how to give feedback. Great Development Teams have learned how to give each other feedback in an honest and respectful manner. They grasp the concept of the ‘Situation – Behavior – Impact Feedback Tool’ and hereby provide clear, actionable feedback. They give feedback whenever it’s necessary, and don’t postpone feedback until the retrospective.
  • Manages their team composition. Great Development Teams manage their own team composition. Whenever specific skills are necessary, they collaborate with other teams to discuss the opportunities of ‘hiring’ specific skills.
  • Practice collective ownership. Great Development Teams understand the importance of collective ownership. Therefore they rotate developers across different modules of the applications and systems to encourage collective ownership.
  • Fix dependencies with other teams. Great Development Teams are aware of possible dependencies with other teams and manage these by themselves, thereby ensuring a sustainable development pace for the product.
  • Don’t need story points. Great Development Teams don’t focus on story points anymore. They’ve refined the product backlog so that the sizes of the top items don’t vary much. They know how many items they can realize each sprint. Counting the number of stories is enough for them.
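To make the last point concrete, here is a minimal sketch of forecasting by counting stories instead of summing story points. It assumes, as described above, that the top backlog items are of similar size. The function name forecast_sprints and the sprint history are hypothetical, and a plain average is just one simple way to do it.

```python
import math


def forecast_sprints(items_remaining, items_done_per_sprint):
    """Estimate how many sprints the remaining items need, from past story counts."""
    average_per_sprint = sum(items_done_per_sprint) / len(items_done_per_sprint)
    # Round up: a partly used sprint is still a sprint.
    return math.ceil(items_remaining / average_per_sprint)


# Hypothetical history: stories finished in each of the last four sprints.
history = [7, 8, 6, 7]
sprints_needed = forecast_sprints(30, history)
print(sprints_needed)
```

With an average of 7 stories per sprint, 30 remaining stories need 5 sprints. The forecast is only as good as the assumption that items are similarly sized, which is exactly why the refinement work matters.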

About the Author

Barry Overeem is an Agile Coach at Prowareness and Professional Scrum Trainer at He is an active member of the Agile community and shares his insights and knowledge by speaking at conferences and writing articles. Since 2000 he has fulfilled several roles within a software development environment, varying from application consultant to project manager and team lead. Since 2010 his primary focus has been applying the Agile mindset and the Scrum Framework. Barry specializes in the role of the Scrum Master and in helping people understand the spirit of Scrum and thereby use the Scrum framework better. Due to his own practical experience as a Scrum Master, Barry has gained a lot of experience with starting new teams, coaching teams through the different stages of team development and applying different types of leadership. Sharing these experiences and thereby contributing to other people’s growth is his true passion!


Source: Characteristics of a Great Scrum Team

Of Gods and Procrastination: Agile Management

Common errors in Agile management, and how to get the best quality code from your developers without tying them to their keyboards.

Quickly embraced by some, completely ignored by others, Agile management spent the last decade and a half gaining popularity. Today it has become one of the trendiest phenomena, and most companies claim to have adopted and applied its principles. And indeed, that is true, yet almost always in a modified, and sometimes even in a corrupted, way.

Most of us in software development have been in different teams and projects, as developers in some, and as team leads & managers in others. We coped with clients in kick-off meetings, decoding acceptance criteria and wireframes, defining the MVP, accepting changes in the requirements, and modifying the scope in the middle of the sprint. We made mistakes and learned the hard way. Mistakes that we should not commit again. Yet sometimes, we see them happening again in our own team, when visiting clients, or meeting teams with whom we will collaborate, etc. Today, I want to focus on the troubles I see the most among managers with software developer teams.

Agile Management Common Errors

Meeting a Tight Deadline by Adding Additional Resources to the Team

You check the estimation for the running sprint and find out that the team is currently 16 hours behind schedule. It seems like no problem at all; you have a developer who just finished his project, he will join the struggling team for two days, and you guys will be back on track.

By doing this, the team probably won’t meet the deadline. Why? The mistake here is to think that a developer performs easy, automatable work. Well, he doesn’t. Even if he is familiar with the technologies used in the project, he still needs an introduction to what the project is about, what has already been done, where all the documentation is, what wireframes are still being reviewed, and a very long list of other things. Besides the fact that the productivity of this developer will be low (at least in the beginning), he will also need to be assisted by the rest of the team to be taught everything described above. A task that is usually assigned to one particular person. Well, that person’s productivity will also drop as he will be distracted and will have to spend hours helping and training the new member of the team.
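A quick back-of-the-envelope calculation shows why the extra developer may not close the gap. The 16-hour deficit and the two days come from the scenario above; everything else is an assumption chosen purely for illustration: an 8-hour working day, a newcomer working at 25% of normal productivity during those two days, and 6 hours of a teammate’s time spent on onboarding. The function name net_hours_gained is hypothetical.

```python
def net_hours_gained(days_on_team, hours_per_day, newcomer_productivity, mentoring_hours):
    """Effective hours the team gains from a temporary extra developer.

    newcomer_productivity: fraction of normal output (assumed, e.g. 0.25).
    mentoring_hours: hours an existing team member loses to onboarding (assumed).
    """
    newcomer_output = days_on_team * hours_per_day * newcomer_productivity
    return newcomer_output - mentoring_hours


deficit_before = 16  # hours behind schedule, from the scenario
gain = net_hours_gained(
    days_on_team=2,
    hours_per_day=8,             # assumption
    newcomer_productivity=0.25,  # assumption
    mentoring_hours=6,           # assumption
)
deficit_after = deficit_before - gain
print(gain, deficit_after)
```

Under these assumptions the newcomer contributes 4 effective hours while costing the team 6, so the team ends up 18 hours behind instead of 16. Different numbers give a different outcome, but the structure of the calculation is the point: the newcomer’s reduced output and the mentor’s lost time both have to be counted.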

Parkinson’s Law

In the beginning, Parkinson’s Law had a different meaning. Today it states that work expands to fill the time available for its completion. Let’s say we are having a performance issue with a particular project. So we assign one of our best developers to investigate what the problem is and to fix the bottleneck. To do that, we decide to set aside a week.

Parkinson’s Law does not affect every developer equally; sometimes it’s stronger, sometimes it’s weaker. Yet an open-ended time budget for a task, especially one without a clear end point, will lead to its expansion until it occupies all the time available. Sometimes it can be even worse: the resulting solution suffers from over-engineering that adds unnecessary complexity to the project. To avoid this situation, the task must have strict goals and acceptance criteria or, failing that, frequent check-ins on the results achieved so far.

Multiple Projects and Distributed Teams

Although you can find an enormous number of articles out there on how to succeed with distributed teams applying Agile (keeping in mind an even distribution of tasks, understanding time and cultural differences, etc.), not every project is suited to it. This is especially true when each developer is part of several different teams at the same time. In the attempt to make each developer an ‘all-terrain’ developer, they end up working on multiple projects at the same time; for instance, Monday and Tuesday on project A, and Wednesday to Friday on project B.

You have probably heard about the developer’s focus. This is when the developer is in their most productive state of mind. Once they lose that focus, however, it takes some time to get it back. Well, the same thing happens when switching between projects and checking what was missed while being away. Although some tools help to mitigate this effect, Dailies, Jira or Trello, for example, it still drags down one’s productivity.

It’s easy to disagree about this topic, but by default, I do defend the 6th principle of the Agile Manifesto, for both ‘co-location’ and ‘co-time’:

“The most efficient and effective method of conveying information to and within a development team is face-to-face conversation.”

Time Pressure Increases Productivity

It’s surprising how many co-workers and professional colleagues turn out to have had the same dark experience, often in a startup or small company, with a very talented manager who made them ‘believe in themselves and in the product.’ It ended in countless extra hours, tons of stress, and impossible-to-meet deadlines. When developers accept such working conditions, their manager can presume that the team is quite productive, as the ratio of delivered functionality to time is unbelievably high.

In fact, the productivity of the team might be considered high, as long as you don’t take into account the quality of the work done and the extra (and unpaid) hours. Those kinds of conditions inevitably lead to two results:

  • First, the quality of the code goes downhill. This means more bugs slipping through (no need to explain that a bug in production is hundreds of times more expensive for the company than getting it right in the first place), test coverage falls, functionality is only partially implemented, etc. Under time pressure, developers don’t work more; they work faster. Basically, quantity over quality.
  • Second, at a certain point, the developer ends up quitting. Once he realizes that things won’t change and that his workaholic attitude and commitment to the company aren’t really good for him, he will find another place to try his luck. Usually, the one who leaves is the one who was under the most pressure, i.e. the one who can actually leave because he knows he has other options. And guess what? That’s the developer you can’t afford to lose.

To finish, I would like to point out the link between a developer’s code and his motivation. Usually, software development is not just something that ‘pays the bills’; it’s someone’s passion. For a developer, having the chance to write quality code means a burst of motivation. Take that chance away, and the motivation is gone.

And when a manager’s focus for the team is to keep up the productivity, the quality rarely increases in a significant way. When the focus is on quality, however, productivity tends to soar.

Source: Of Gods and Procrastination: Agile Management – DZone Agile

HTML5 vs. Native: The Debate Is Over

This editorial covers the pros and cons of HTML5 and native mobile app development. See how native app development can give you an edge.

Market analysts and mobile strategists love telling us that the debate of HTML5 vs. native apps is highly overrated. What’s more important is the overall strategy, they say: the readiness of your enterprise systems for mobile, the availability of mobile talent, the preferences of your development teams, the cost/benefit analysis, etc.

But from what we’ve seen, the debate is pretty much over. While HTML5 has made good progress in the past few years, and while app developers still cite HTML5 as their most-used mobile platform (understandably, given their general skill sets!), the market clearly dictates the choice of native mobile platforms.

Customers Choose Native Apps

Studies show that the mobile experience has a huge effect on how customers view a brand and interact with it. A recent Oracle study found that more than half (55%) of surveyed customers said a bad app would put them off using a company’s products or services.

If you are a consumer-focused, market-driven company, chances are that your customers have already spoken, and you’ve taken notice. What we see again and again is organizations choosing an HTML5 or hybrid app and being bitterly disappointed by their customers’ reaction. The organizations that put out native apps, on the other hand, have an immediate competitive advantage.

Case in point: Delta’s native Fly Delta app and its superior performance vs. its competitors’ web-based apps. Customers who fly often rely on user-friendly native features to quickly get information, submit itinerary changes, and more. Web (HTML5) apps simply do not offer the smooth experience native apps are known for. Domino’s Pizza, likewise, dazzled users with its overhauled native app, leaving its competition to play catch-up.

Why Do Native Apps Stand Out?

Let’s look at some differences between the dominant mobile strategies currently in play in enterprises.

Mobile Website


Pros:

  • A quick option for making existing content available on a mobile device.
  • Only needs to be built once and is usable on every device.

Cons:

  • Inferior user experience: Users expect their mobile apps to be special: a compelling user interface, unique features that take advantage of mobile device hardware, and relevant push messaging. These are very limited on mobile websites, which are usually meant to simply display information.
  • Inferior performance: Long load times and the inability to function in offline or low-bandwidth mode can be a major turnoff. According to Flurry, mobile users spend 86% of their time in apps rather than in the browser.

HTML5 or Hybrid Mobile App


Pros:

  • A mobile app built with HTML5 or comparable technology gives you ‘real estate’ on a customer’s mobile phone, which can bridge some of the gaps between native apps and a mobile website.
  • HTML5 or hybrid (wrapped) apps are an appealing route for teams that have web development skills and want to build mobile apps to drive more engagement.

Cons:

  • An HTML5 app is, basically, a mobile website in ‘sheep’s clothing.’ All the UX and performance issues mentioned above still apply. As these apps require a constant Internet connection, they restrict users’ ability to use them in low/no-bandwidth areas.
  • These apps are built with the exact same features for every device, without the device-specific functionality that users have come to expect in a mobile experience.

Facebook CEO Mark Zuckerberg has stated that betting on HTML5 was his biggest mistake with mobile.

Native Mobile App

Custom, native mobile apps offer the best user experience with the most functionality.

Pros:

  • Rich, clean UI.
  • Use of cutting-edge device capabilities.
  • Faster load times compared to web apps.
  • Robust performance online or offline.
  • Higher discoverability.
  • Superior security in comparison to HTML5.

Cons:

  • Require development for each mobile platform.
  • Native development skills are expensive!

Source: HTML5 vs. Native: The Debate Is Over – DZone Mobile

Restful API Design: An Opinionated Guide

One developer’s opinion on what constitutes good API design. This journey touches on URL formats to error handling to verbs in URLs.

This is very much an opinionated rant about APIs, so it’s fine if you have a different opinion. These are just my opinions. Most of the examples I talk through are from the Stack Exchange or GitHub API — this is mostly just because I consider them to be well-designed APIs that are well-documented, have non-authenticated public endpoints, and should be familiar domains to a lot of developers.

URL Formats


OK, let’s get straight to one of the key aspects. Your API is a collection of URLs that represent resources in your system that you want to expose. The API should expose these as simply as possible — to the point that if someone was just reading the top level URLs, they would get a good idea of the primary resources that exist in your data model (e.g. any object that you consider a first-class entity in itself). The Stack Exchange API is a great example of this. If you read through the top level URLs exposed, you will probably find they match the kind of domain model you would have guessed:

  • /users
  • /questions
  • /answers
  • /tags
  • /comments

And while there is no expectation that anyone will attempt to guess your URLs, I would say these are pretty obvious. What’s more, if I were a client using the API, I could probably have a fair shot at understanding these URLs without any further documentation.

Identifying Resources

To select a specific resource based on a unique identifier (an ID, a username, etc.), the identifier should be part of the URL. Here we are not attempting to search or query for something; rather, we are attempting to access a specific resource that we believe should exist. For example, if I were to access the GitHub API for my username, I would expect the concrete resource to exist.

The pattern is as follows (elements in square braces are optional):


Including an identifier returns just the identified resource, assuming one exists, and otherwise returns a 404 Not Found. This differs from filtering or searching, where we might return a 200 OK and an empty list. This can be flexible: if you prefer to return an empty list for identified resources that don’t exist, that is also a reasonable approach, as long as it is consistent across the API. The reason I go for a 404 when the ID is not found is that if our system makes a request with an ID, it normally believes the ID is valid, so a missing resource is an unexpected exception. By contrast, if our system were filtering users by sign-up date, it’s perfectly reasonable to expect the scenario where no user is found.
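As a rough illustration of that distinction, here is a minimal Python sketch. The in-memory USERS data and the function names get_user and find_users are hypothetical stand-ins, not part of any real API; the only point is that a lookup by identifier and a filter report "nothing found" differently.

```python
from http import HTTPStatus

# Hypothetical in-memory data standing in for a real data store.
USERS = {"alice": {"username": "alice", "signed_up": "2016-01-10"}}


def get_user(username):
    """Identified resource: a missing identifier maps to 404 Not Found."""
    user = USERS.get(username)
    if user is None:
        return HTTPStatus.NOT_FOUND, None
    return HTTPStatus.OK, user


def find_users(signed_up_after):
    """Filtering: no match is a normal outcome, so return 200 OK and an empty list."""
    # ISO dates compare correctly as plain strings.
    matches = [u for u in USERS.values() if u["signed_up"] > signed_up_after]
    return HTTPStatus.OK, matches
```

Whichever convention you choose for missing identifiers, the key is to apply it the same way on every endpoint.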


A lot of the time our data model will have natural hierarchies — for example, StackOverflow Questions might have several child Answers, etc. These nested hierarchies should be reflected in the URL hierarchy. If we look at the Stack Exchange API for the previous example:


Again, it is (hopefully) clear without further documentation what the resource is: at a glance, the URL represents all answers that belong to the identified questions.

This approach naturally allows as many levels of nesting as necessary using the same approach, but as many resources are top-level entities as well, this prevents you from needing to go much further than the second level. To illustrate, let’s consider we wanted to extend the query from all answers to a given question to instead query all comments for an identified answer — we could naturally extend the previous URL pattern as follows


But as you have probably recognized, we have /answers as a top-level URL, so the additional prefixing of /questions/{ids} is surplus to our identification of the resource (and actually, supporting the unnecessary nesting would also mean additional code and validation to ensure that the identified answers are actually children of the identified questions).

There is one scenario where you may need this additional nesting, and that is when a child resource’s identifier is only unique in the context of its parent. A good example of this is GitHub’s user and repository pairing. My GitHub username is a global, unique identifier, but the names of my repositories are only unique to me (someone else could have a repository with the same name as one of mine, as is frequently the case when a repository is forked). There are two good options for representing these resources:

  1. The nested approach described above. So for the GitHub example, the URL would look like:
    /users/{username}/repos/{reponame}. I like this, as it’s consistent with the recursive pattern defined previously, and it is clear what each of the variable identifiers is relating to.
  2. Another viable option, the approach that GitHub actually uses is as follows:
    /repos/{username}/{reponame}. This changes the repeating pattern of {RESOURCE}/{IDENTIFIER} (unless you just consider the two URL sections as the combined identifier). However, the advantage is that the top-level entity is what you are actually fetching — in other words, the URL is serving a repository, so that is the top level entity.

Both are reasonable options and really come down to preference. As long as it’s consistent across your API, then either is OK.
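To make the comparison concrete, the following Python sketch parses both URL shapes with regular expressions. The pattern names and the parse_repo_url helper are made up for illustration. A real API would pick one shape and stick to it; the sketch accepts both only to show that each carries the same two identifiers.

```python
import re

# Option 1: nested {RESOURCE}/{IDENTIFIER} pairs.
NESTED = re.compile(r"^/users/(?P<username>[^/]+)/repos/(?P<reponame>[^/]+)$")
# Option 2: the fetched entity first, followed by a combined identifier.
FLAT = re.compile(r"^/repos/(?P<username>[^/]+)/(?P<reponame>[^/]+)$")


def parse_repo_url(path):
    """Return (username, reponame) for either URL shape, or None if neither matches."""
    for pattern in (NESTED, FLAT):
        match = pattern.match(path)
        if match:
            return match.group("username"), match.group("reponame")
    return None
```

Either way, the repository is only identifiable once both the owner and the repository name are known.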

Filtering and Additional Parameters

Hopefully, the above is fairly clear and provides a high-level pattern for defining resource URLs. Sometimes, we want to go beyond this and filter our resources; for example, we might want to filter Stack Overflow questions by a given tag. As hinted at earlier, we are not sure of any resource’s existence here; we are simply filtering. So, unlike with an incorrect identifier, we don’t want to respond with a 404 Not Found, but rather return an empty list.

Filtering controls should be entered as part of the URL query parameters (i.e. after the first ? in the URL). Parameter names should be specific, understandable, and lower case. For example:


All the parameters are clear and make it easy for the client to understand what is going on (also worth noting that a filter that matches nothing returns an empty list, not a 404 Not Found). You should also keep your parameter names consistent across the API. If you support common functions such as sorting or paging on multiple endpoints, make sure the parameter names are the same.
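A small Python sketch of query-parameter filtering, using only the standard library, might look like this. The QUESTIONS data, the filter_questions function, and the tagged parameter name are assumptions chosen for the example.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical data standing in for a real question store.
QUESTIONS = [
    {"id": 1, "title": "How do I parse JSON?", "tags": ["python", "json"]},
    {"id": 2, "title": "What is a monad?", "tags": ["haskell"]},
]


def filter_questions(url):
    """Filter questions using a lower-case 'tagged' query parameter."""
    params = parse_qs(urlsplit(url).query)
    tagged = params.get("tagged", [None])[0]
    if tagged is None:
        return list(QUESTIONS)  # no filter supplied: return everything
    # A filter that matches nothing yields an empty list, not an error.
    return [q for q in QUESTIONS if tagged in q["tags"]]
```

Calling filter_questions("/questions?tagged=cobol") returns an empty list, which a handler would serve with a 200 OK.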


As should be obvious in the previous sections, we don’t want verbs in our URLs, so you shouldn’t have URLs like /getUsers or /users/list, etc. The reason for this is the URL defines a resource, not an action. Instead, we use the HTTP methods to describe the action: GET, POST, PUT, HEAD, DELETE, etc.


Like many of the RESTful topics, this is hotly debated and pretty divisive. Very broadly speaking, the two approaches to define API versioning are:

  • Part of the URL.
  • Not part of the URL.

Including the version in the URL will largely make it easier for developers to map their endpoints to versions, etc., but for clients consuming the API, it can make it harder (often they will have to go and find-and-replace API URLs to upgrade to a new version). It can also make HTTP caching harder — if a client POSTs to /v2/users, then the underlying data will change, so the cache for GET-ting users from /v2/users is now invalid. However, the API versioning doesn’t affect the underlying data, so that same POST has also invalidated the cache for /v1/users etc. The Stack Exchange API uses this approach (as of writing, their API is based at

If you choose not to include the version in the URL, then two possible approaches are HTTP request headers or content negotiation. This can be trickier for the API developers (depending on framework support, etc.) and can also have the side effect of clients being upgraded without knowing it (e.g. if they don’t realize they can specify the version in the header, they will default to the latest). The GitHub API uses this approach.
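The silent-upgrade side effect is easy to see in a minimal Python sketch. The header name X-Api-Version, the resolve_version function, and the version numbers are all made up for illustration and do not describe any particular API.

```python
LATEST_VERSION = "2"  # hypothetical current version of the API


def resolve_version(headers, default=LATEST_VERSION):
    """Pick the API version from a request header, falling back to the latest."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    return normalized.get("x-api-version", default)
```

A client that never sends the header gets whatever LATEST_VERSION is at the time, which is exactly how it can be upgraded without knowing it.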

I think this sums it up quite nicely:

Response Format

JSON is the de facto standard response format for RESTful APIs. If required, you can also provide other formats (XML, YAML, etc.), which would normally be managed using content negotiation.

I always aim to return a consistent response message structure across an API. This is for ease of consumption and understanding across calling clients.

Normally, when I build an API, my standard response structure looks something like this:

[ code: "200", response: [ /** some response data **/ ] ]

This does mean that any client always needs to navigate down one layer to access the payload, but I prefer the consistency this provides, and it also leaves room for other metadata to be provided at the top level (for example, if you have rate limiting and want to provide information regarding remaining requests, etc., this is not part of the payload but can consistently sit at the top level without polluting the resource data).

This consistent approach also applies to error messages — the code (mapping to HTTP status codes) reflects the error and the response, in this case, is the error message returned.
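One way to sketch such an envelope in Python is shown below. The envelope function and the rate_limit_remaining field are hypothetical; only the code and response keys come from the structure described above.

```python
import json
from http import HTTPStatus


def envelope(status, response, **metadata):
    """Wrap a payload (or an error message) in one consistent top-level structure."""
    body = {"code": str(int(status)), "response": response}
    body.update(metadata)  # e.g. rate-limit details sit beside the payload, not inside it
    return json.dumps(body)


# Success and error responses share the same shape.
ok = envelope(HTTPStatus.OK, [{"id": 1}], rate_limit_remaining=41)
error = envelope(HTTPStatus.NOT_FOUND, "User not found")
```

Clients can then always read code first and decide how to interpret response.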

Error Handling

Make appropriate use of the HTTP status codes for errors: 2xx status codes for successful requests, 3xx for redirects, 4xx for client errors, and 5xx for server errors (you should avoid ever intentionally returning a 500 error code; these should be reserved for when unexpected things go wrong within your application).

I combine the status code with the consistent JSON format described above.

Source: Restful API Design: An Opinionated Guide – DZone Integration

How to Select App Development Frameworks: Native vs. Web vs. Hybrid

When selecting frameworks with which to build apps, the biggest consideration is what kind of apps organizations plan to deliver: native, Web, or hybrid.

As companies start to build their own mobile apps, they’ll have to choose app development frameworks.

It’s an important first step to take an existing process and make it accessible from a mobile device. But apps become transformative when they take advantage of the rich features a mobile device can provide, such as location-based services, push notifications and seamless data sharing. When selecting frameworks with which to build these apps, the biggest consideration is what kind of apps organizations plan to deliver: native, Web, or hybrid.

Native App Development

Building a native application always results in the best user experience. Mobile operating system makers invest heavily in their own development tools. They want to make sure that the apps developers write in their native languages can take advantage of all the latest OS features and perform at their best.

Companies that have no prior experience with mobile development may be hesitant to have their developers learn separate languages to write apps for both Apple iOS and Google Android, however, because of the extra effort required to support multiple OSes. In situations where a company is standardized on one OS or requires tight hardware integration — such as with embedded devices, kiosks, point of sale and other special company-issued devices — it may be more important to develop native applications.

Browser-based Web Apps

At the other end of the spectrum are Web apps that users access in mobile browsers. Developers can write these with more familiar Web technologies, such as HTML5, CSS, and JavaScript. One version of a Web app can run on multiple mobile OSes. In addition, advances in HTML5 mean Web apps can now do many of the things that native mobile apps can, such as take advantage of cameras and geolocation and launch other apps on the device from within the Web app.

They’re attractive, but browser-based apps have many drawbacks. It’s only possible to send users push notifications with a few specific browsers, and app logins and sessions can expire quickly. Plus, Web apps can’t take advantage of all the latest OS features — some of the most unique and important traits that make mobile apps so valuable.

For all these reasons, mobile browser-based Web apps have limited use cases in the enterprise. Another problem: Web apps are just less convenient than native apps. Users have to remember how to navigate them, dig into bookmark menus to find them or know how to place Web bookmarks on their home screens. IT administrators can push these bookmarks to devices with mobile device management, but they still require different management processes than other apps.

Hybrid Apps

Hybrid apps are Web apps that run inside of a native app shell. Conceptually, this approach brings the best of both worlds. Developers can write the core logic of the app in a Web-based language that’s portable across multiple mobile platforms, and the shell can use native code to interact with the device’s unique features. As a result, IT can manage and deploy a hybrid app just like any other native app.

Hybrid apps can be as simple as a Web page embedded inside a native app, or they can be much more complex. Many app development frameworks can take apps that developers write in Web-based or other languages and compile them into complete native apps for different mobile OSes. Some app development frameworks provide only the user interface for hybrid apps, using native code and all the OS-appropriate design elements.

The hybrid app concept is popular but can have drawbacks as well. Using one codebase for multiple OSes means that developers have to make some compromises. For example, iOS doesn’t have full near-field communications support, and Android and iOS apps have different navigation conventions.

In some cases, with all the extra effort needed to make the “write once, deploy anywhere” concept actually work on multiple platforms, it can be easier to just invest in native app development after all.

Other Considerations

Enterprise mobile apps also need to connect to infrastructure components for push notifications, management and security, analytics, data syncing and connections to enterprise databases and feeds.

Some people say hybrid and Web apps are good enough for enterprise apps. Others say user experience is more important these days, and apps should take advantage of the native features a mobile OS offers. But different apps and situations call for different development approaches and considerations.




Source: How to Select App Development Frameworks: Native vs. Web vs. Hybrid – DZone Mobile

Hybrid vs. Native: Choose in 5 Minutes!

If you’re confused and wondering whether to build a hybrid mobile app or a native mobile app, don’t worry: this article will help you choose your mobile app strategy in under 5 minutes!

We have met curious and confused entrepreneurs who go crazy trying to decide which approach is best: native mobile app development or hybrid app development.

Quick one-liners on hybrid apps and native apps before we begin:

  • Hybrid App: Developers augment web code with a native SDK. It can be easily deployed across multiple platforms and is usually the cheaper and faster solution.

  • Native App: This is platform-specific (iOS, Android, etc.) and requires specialized expertise. However, the full potential of the platform can be used, which drives a great user experience and broader app capabilities (especially around phone hardware). It can be expensive, depending on requirements, and takes more time to develop.

5 Questions to ask before you choose

Do You Need to Use Native Features in the Mobile App?

If your app relies heavily on native phone capabilities and this is your primary USP, then native app development will work best. When building a hybrid app, depending on the framework you adopt (there are several on the market), you may or may not have access to native features.

How Quickly Do You Want to Take It to Market?

Time to market depends on various factors, such as the number of features and the number of resources you have. More resources usually mean a bigger budget. If you want to launch the mobile app quickly with limited resources, it is wise to go with the hybrid app approach, which will get your app onto multiple platforms in a short time.

Do You Have Separate Budgets for Developers in iOS and Android (Considering That They Dominate the Market Share)?

If you can allocate separate budgets for iOS and Android development resources, and you have the freedom to take your time getting to market, then you don’t need to worry much; go for a native app!

How Often Do You Need to Update Your Mobile App?

If you need to make frequent updates to your app, which means users will have to update the app often (and you don’t want to annoy them with that), then you should consider a hybrid app. The biggest advantage of a hybrid app is that, unless there is a fundamental change to the app’s functionality, all the content is updated directly from the web.

Do You Need to Have the Best User Experience?

If you want to create a great user experience, the native app approach is better. A hybrid app can never match the level of creative user experience that you get in a native app. However, this doesn’t mean that the user experience of a hybrid app is bad.


The answer to “Which is better?” is fairly nuanced. Native apps offer the best end-user experience, yet require the most specialized skills and are the most expensive to develop. Hybrid apps have a lower barrier to entry, are the cheapest to develop, and reach the most platforms, yet don’t necessarily capture the right look and feel of what end users may expect, and generally won’t perform as well or be as feature-rich. If cost is not an issue, native apps offer the best product, but in more cost-sensitive situations, hybrid apps still offer a compelling (if not quite as good) experience. In the end, the answer comes down to the organization, the developer, and the end users as to which solution is most appropriate.

Source: Hybrid vs. Native: Choose in 5 minutes! – DZone Mobile

3 Tips for Selecting the Right Database for Your App 

Perhaps you’re building a brand new application. Or maybe your current database isn’t working well. Choosing the right database for your application can be overwhelming, given all the choices available today.

Having used a variety of database vendors in production, I can easily say that there is no one right answer. So, to help you along with the decision-making process, I’m going to give you three tips for selecting the right database for your application:

Tip #1: It Isn’t a SQL vs. NoSQL Decision

There are countless articles on the pros and cons of SQL and NoSQL databases. While they may provide some insight into the differences, they miss many of the important factors of the decision-making process. Most importantly, we need to select the database that supports the right structure, size, and/or speed to meet the needs of our application.

Structure focuses on how you need to store and retrieve your data. Our applications deal with data in a variety of formats, so selecting the right database includes picking the right data structures for storing and retrieving data. If you select the wrong data structures for persisting your data, your application will require more development effort to work around these issues and may not scale as a result.

Size is about the quantity of data you need to store and retrieve as critical application data. The amount of data you can store and retrieve before the database is negatively impacted may vary based on a combination of the data structure selected, the database’s ability to partition data across multiple filesystems and servers, and vendor-specific optimizations.

Speed and scale address the time it takes to service incoming reads and writes to your application. Some databases are designed to optimize read-heavy apps, while others are designed to support write-heavy solutions. Selecting a database that can handle your app’s I/O needs goes a long way to a scalable architecture.

The important thing is to understand the needs of your application, from the structure of your data to its size and the read and write speeds you need. If you’re uncertain, you may wish to perform some data modeling to help you map out what’s needed.

This leads us to my next tip:

Tip #2: Use Data Modeling to Guide Database Selection

Data modeling helps map your application’s features into the data structure you’ll need to implement them. Starting with a conceptual model, you can identify the entities, associated attributes, and entity relationships that you’ll need. As you go through the process, it will become more apparent the type(s) of data structures you’ll need to implement. You can then use these structural considerations to select the right category of database that will serve your application best:

Relational: stores data into classifications (‘tables’), with each table consisting of one or more records (‘rows’) identified by a primary key. Tables may be related through their keys, allowing queries to join data from multiple tables together to access any/all required data. Relational databases require fixed schemas on a per-table basis that are enforced for each row in a table.

Document-oriented: stores structured information with any number of fields that may contain simple or complex values. Each document stored may have different fields, unlike SQL tables, which require fixed schemas. Some document stores support complex hierarchies of data through the use of embedded documents. Additionally, document stores offer extreme flexibility to developers, as fixed schemas do not need to be developed ahead of time. Search stores are often document-oriented databases that are optimized for data querying across one or more fields. Search-based data stores typically support additional features such as sorting by relevance and data faceting for drill-down capabilities.

Key/Value: Key/Value stores offer great simplicity in data storage, allowing for massive scalability of both reads and writes. Entries are stored as a unique key (“bob”) with a value (“555-555-1212”) and may be manipulated using the following operations: Add, Reassign (Update), Remove, and Read. Some storage engines offer additional data structure management within these simple operations.
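The four operations can be sketched in a few lines of Python. The KeyValueStore class below is a toy, in-memory illustration with made-up method names, not the interface of any real key/value product.

```python
class KeyValueStore:
    """Toy in-memory store exposing only Add, Reassign, Remove, and Read."""

    def __init__(self):
        self._data = {}

    def add(self, key, value):
        if key in self._data:
            raise KeyError(f"key already exists: {key!r}")
        self._data[key] = value

    def reassign(self, key, value):
        if key not in self._data:
            raise KeyError(f"unknown key: {key!r}")
        self._data[key] = value

    def remove(self, key):
        del self._data[key]  # raises KeyError if the key is absent

    def read(self, key):
        return self._data.get(key)  # None when the key is absent


store = KeyValueStore()
store.add("bob", "555-555-1212")
store.reassign("bob", "555-555-0000")
```

Because every operation addresses exactly one key, data like this is straightforward to partition across servers, which is where the scalability comes from.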

Column-oriented: similar to relational, data is stored in both rows and columns. However, columns may contain multiple values, allowing data to be fetched by row or by column for highly optimized data retrieval.

Graph: graph stores focus on storing entities and the relationships between them. These stores are very useful in navigating between entities and querying for relationships between them to any depth — something that is difficult to do with traditional relational or document databases.

As you start to map your application’s features to data structures, consider the kinds of queries you will need to support for your UI or API. Some data structures make it easier to map single entities into and out of your application, but they may not support the kinds of ad hoc queries you need for more complex data retrieval and reporting.

A final note on data modeling: Don’t depend on things like database migrations and scaffolding generators to define your database structures. Data modeling will help you understand the data structures necessary to build your application. Use these tools to accelerate the implementation process, based on your database model.

Tip #3: You May Need More Than One Type of Database

During the modeling process, you may realize that you need to store your data in a specific data structure, where certain queries can’t be optimized fully. This may be due to some complex search requirements, the need for robust reporting capabilities, caching, or the requirement for a data pipeline to accept and analyze incoming data. In these situations, more than one type of database may be required for your application.

When adopting more than one database, it’s important to select one database that will own a specific set of data. This database becomes the canonical database for those entities or for a specific context. Any additional databases that work with this same data may have a copy, but they will not be considered an owner of this data.

For example, we may decide that a relational database is the best data structure for our application. However, we need to support a robust, faceted search within our application. In this case, we may choose PostgreSQL or MySQL as the canonical data store for all our entities. We then choose a document database such as Elasticsearch to index our entities by specific fields and facets. Elasticsearch may also store some basic details about our entities, such as name and description, so that search results are useful on their own. However, Elasticsearch does not own our entity data and we do not query it for the latest details. Instead, we consider the relational database the canonical source for the entity details and updates. We then keep Elasticsearch updated whenever data changes in our relational database.
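
The pattern can be sketched in a few lines of Python. This is only an illustration under loud assumptions: an in-memory SQLite database stands in for PostgreSQL/MySQL, a pair of plain dictionaries stands in for Elasticsearch, and the `products` table and helper names are invented.

```python
import sqlite3

# Canonical store (stand-in for PostgreSQL/MySQL): owns the entity data.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, description TEXT, price REAL)"
)

# Search index (stand-in for Elasticsearch): holds only a derived copy.
search_terms = {}      # word -> set of product ids
search_summaries = {}  # product id -> {"name": ..., "description": ...}


def index_product(product_id):
    """Rebuild the search copy of one product from the canonical row."""
    for ids in search_terms.values():
        ids.discard(product_id)  # drop stale postings first
    row = db.execute(
        "SELECT name, description FROM products WHERE id = ?", (product_id,)
    ).fetchone()
    if row is None:
        search_summaries.pop(product_id, None)
        return
    name, description = row
    search_summaries[product_id] = {"name": name, "description": description}
    for word in f"{name} {description}".lower().split():
        search_terms.setdefault(word, set()).add(product_id)


def save_product(product_id, name, description, price):
    """Write to the canonical database first, then refresh the search copy."""
    with db:  # commits on success, rolls back on error
        db.execute(
            "INSERT OR REPLACE INTO products (id, name, description, price) VALUES (?, ?, ?, ?)",
            (product_id, name, description, price),
        )
    index_product(product_id)


def search(word):
    """Find ids in the index, then load current details from the canonical store."""
    ids = sorted(search_terms.get(word.lower(), set()))
    return [
        db.execute("SELECT id, name, price FROM products WHERE id = ?", (i,)).fetchone()
        for i in ids
    ]


save_product(1, "Red Kettle", "Stovetop kettle", 25.0)
save_product(1, "Blue Kettle", "Electric kettle", 30.0)
```

Note that `search` uses the index only to find matching ids; the price always comes from the canonical database, so a lagging index can return a stale match but never a stale price.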

It’s important to be thoughtful when adopting more than one database. Otherwise, your application may behave inconsistently and result in frustrated customers.

Putting it All Together

To summarize the process I use for selecting a database:

  1. Understand the data structure(s) you require, the amount of data you need to store/retrieve, and the speed/scaling requirements.
  2. Model your data to determine if a relational, document, columnar, key/value, or graph database is most appropriate for your data.
  3. During the modeling process, consider things such as the ratio of reads-to-writes, along with the throughput you will require to satisfy reads and writes.
  4. Consider the use of multiple databases to manage data under different contexts/usage patterns.
  5. Always use a master database to store and retrieve canonical data, with one or more additional databases to support additional features such as searching, data pipeline processing, and caching.

Source: 3 Tips for Selecting the Right Database for Your App – DZone Database

re: Why Uber Engineering Switched From Postgres to MySQL

The Uber Engineering group has posted a really great blog post about their move from Postgres to MySQL. I mean that quite literally: it is a pleasure to read, especially since they went into such detail on the on-disk format and its implications for their performance.

For fun, there is another great post from Uber, about moving from MySQL to Postgres, which also has interesting content.

Go ahead and read both, and we’ll talk when you are done. I want to compare their discussion to what we have been doing.

In general, Uber’s issues fall into several broad categories:

  • Secondary indexes cost on write
  • Replication format
  • The page cache vs. buffer pool
  • Connection handling

Secondary Indexes

Postgres maintains a secondary index that points directly to the data on disk, while MySQL has a secondary index that has another level of indirection. The images show the difference quite clearly:

[Images: Postgres index layout (Postgres_Tuple_Property_) and MySQL index layout (MySQL_Index_Property_)]

I have to admit that this is the first time I have considered that this kind of indirection might have any advantage. In most scenarios, it turns any scan on a secondary index into an O(N * logN) operation, and that can really hurt performance. With Voron, we actually moved in 4.0 from keeping the primary key in the secondary index to keeping the on-disk position, because the performance benefit was so high.

That said, a lot of the pain that Uber is feeling has to do with the way Postgres has implemented MVCC. Because they write new records all the time, they need to update all indexes, all the time, and after a while, they will need to do more work to remove the old version(s) of the record. In contrast, with Voron we don’t need to move the record (unless its size changed), and all other indexes can remain unchanged. We do that by having copy-on-write and a page translation table, so while we have multiple copies of the same record, they are all in the same “place” logically; it is just the point of view that changes.

From my perspective, that was the simplest thing to implement, and we get to reap the benefit on multiple fronts because of this.
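
To make the trade-off in this section concrete, here is a toy Python model of the two secondary index layouts. It is a sketch under heavy assumptions, not how either engine is implemented: the class names are invented, indexed values are assumed unique, old row versions are never cleaned up, and optimizations that real engines apply are ignored.

```python
class DirectIndexTable:
    """Secondary indexes store the row's position (the Postgres-style layout)."""

    def __init__(self, indexed_columns):
        self.heap = []     # list position stands in for the on-disk location
        self.primary = {}  # id -> position of the latest row version
        self.secondary = {col: {} for col in indexed_columns}  # value -> position
        self.secondary_index_writes = 0

    def upsert(self, row):
        # Every write appends a new row version, so its position changes and
        # every secondary index entry has to be rewritten.
        old_position = self.primary.get(row["id"])
        self.heap.append(row)
        position = len(self.heap) - 1
        self.primary[row["id"]] = position
        for col, index in self.secondary.items():
            if old_position is not None:
                index.pop(self.heap[old_position][col], None)
            index[row[col]] = position
            self.secondary_index_writes += 1

    def find(self, col, value):
        return self.heap[self.secondary[col][value]]  # one hop to the row


class IndirectIndexTable:
    """Secondary indexes store the primary key (the MySQL-style layout)."""

    def __init__(self, indexed_columns):
        self.rows = {}  # primary key -> row
        self.secondary = {col: {} for col in indexed_columns}  # value -> primary key
        self.secondary_index_writes = 0

    def upsert(self, row):
        old = self.rows.get(row["id"])
        self.rows[row["id"]] = row
        for col, index in self.secondary.items():
            if old is not None and old[col] == row[col]:
                continue  # unchanged column: this index is left alone
            if old is not None:
                index.pop(old[col], None)
            index[row[col]] = row["id"]
            self.secondary_index_writes += 1

    def find(self, col, value):
        primary_key = self.secondary[col][value]  # hop 1: secondary index
        return self.rows[primary_key]             # hop 2: primary key lookup


direct = DirectIndexTable(["email", "name", "city"])
indirect = IndirectIndexTable(["email", "name", "city"])
for table in (direct, indirect):
    table.upsert({"id": 1, "email": "a@example.com", "name": "Ann", "city": "Oslo"})
    table.upsert({"id": 1, "email": "a@example.com", "name": "Ann", "city": "Rome"})
```

After one insert and one single-column update, the direct layout has written six secondary index entries and the indirect layout four. In exchange, every read through the indirect layout pays for a second lookup.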

Replication Format

Postgres sends the WAL over the wire (simplified, but easier to explain) while MySQL sends commands. When we had to choose how to implement over-the-wire replication with Voron, we also sent the WAL. It is simple to understand, extremely robust, and we already had to write the code to do that. Doing replication using it also allows us to exercise this code routinely, instead of it only running during rare crash recovery.

However, sending the WAL has issues, because it modifies the data on disk directly, and an issue there can cause severe problems (data corruption, including taking down the whole database). It is also extremely sensitive to versioning issues, and it would be hard, if not impossible, to make sure that we can support multiple versions replicating to one another. It also means that any change to the on-disk format needs to be considered with distributed versioning in mind.

But what killed it for us was the fact that it is almost impossible to handle the scenario of replacing the master server automatically. In order to handle that, you need to be able to deterministically let the old server know that it is demoted and should accept no writes, and the new server that it can now accept writes and send its WAL onward. But if there is a period of time in which both can accept writes, then you can’t really merge the WAL, and trying to is going to be really hard. You can try using distributed consensus to run the WAL, but that is really expensive (about 400 writes / second in our benchmark, which is fine, but not great, and imposes a high latency requirement).

So it is better to have a replication format that is more resilient to concurrent divergent work.

OS Page Cache Vs Buffer Pool

From the post:

Postgres allows the kernel to automatically cache recently accessed disk data via the page cache. … The problem with this design is that accessing data via the page cache is actually somewhat expensive compared to accessing RSS memory. To look up data from disk, the Postgres process issues lseek(2) and read(2) system calls to locate the data. Each of these system calls incurs a context switch, which is more expensive than accessing data from main memory. … By comparison, the InnoDB storage engine implements its own LRU in something it calls the InnoDB buffer pool. This is logically similar to the Linux page cache but implemented in userspace. While significantly more complicated than Postgres’s design…

So Postgres is relying on the OS Page Cache, while InnoDB implements its own. But the problem isn’t with relying on the OS Page Cache, the problem is how you rely on it. And the way Postgres is doing that is by issuing (quite a lot, it seems) system calls to read the memory. And yes, that would be expensive.

On the other hand, InnoDB needs to do the same work as the OS, with less information, and quite a bit of complex code, but it means that it doesn’t need to do so many system calls, and can be faster.

Voron, on the gripping hand, relies on the OS Page Cache to do the heavy lifting, but generally issues very few system calls. That is because Voron memory maps the data, so accessing it is usually just a matter of a pointer dereference; the OS Page Cache makes sure that the relevant data is in memory and everyone is happy. In fact, because we memory map the data, we don’t have to manage buffers for the system calls or do data copies; we can just serve the data directly. This ends up being the cheapest option by far.
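
The difference between the two access styles can be shown with a small Python sketch. It is only an analogy for what the database engines do: the file contents and offsets are made up, and unlike a native engine, Python still copies the bytes when slicing the mapping.

```python
import mmap
import os
import tempfile

# Create a small data file to read from.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"0123456789" * 1000)
    path = f.name

# Style 1: explicit seek + read on an unbuffered file. Every lookup costs
# system calls plus a copy into a freshly allocated buffer.
with open(path, "rb", buffering=0) as f:
    f.seek(5000)
    via_read = f.read(10)

# Style 2: map the file once, then address it like memory. The OS page cache
# decides which pages are resident; lookups are plain indexing.
with open(path, "rb") as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
        via_mmap = mapped[5000:5010]

os.remove(path)
```

Both styles return the same bytes. The point is that the second one pays the system call cost once, when the mapping is created, rather than on every access.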

Connection Handling

Spawning a process per connection is something that I haven’t really seen since the CGI days. It seems pretty harsh to me, but it is probably nice to be able to kill a connection with a kill -9, I guess. Thread per connection is also something that you don’t generally see. The common situation today, and what we do with RavenDB, is to have a pool of threads that all manage multiple connections at the same time, often interleaving execution of different connections using async/await on the same thread for better performance.
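
The interleaving described above can be illustrated with Python’s asyncio. This is a simplified sketch, not RavenDB’s implementation: it uses a single event-loop thread rather than a pool, the "connections" are simulated, and `asyncio.sleep(0)` stands in for waiting on a socket.

```python
import asyncio
import threading

events = []  # (connection id, step, thread id), in the order the steps ran


async def handle_connection(conn_id, requests):
    """Serve one simulated connection; each await is a point where it waits on I/O."""
    for step in range(requests):
        await asyncio.sleep(0)  # yield so other connections can make progress
        events.append((conn_id, step, threading.get_ident()))


async def serve():
    # Three connections share the single event-loop thread.
    await asyncio.gather(*(handle_connection(c, 2) for c in range(3)))


asyncio.run(serve())
thread_ids = {thread_id for _, _, thread_id in events}
```

All six steps run on one thread, and each connection gets a turn before any of them runs its second step. No process or thread is dedicated to a single connection.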

Source: re: Why Uber Engineering Switched From Postgres to MySQL – DZone Database