Sunday, October 19, 2014

How is Agile Unified Process (AUP) different from Rational Unified Process (RUP)?

Agile Unified Process uses Agile Modelling, which describes modelling and documentation practices based on agile principles. The Rational Unified Process (RUP) can also be applied in an agile fashion, and here we highlight the differences between the two. Both processes are divided into disciplines (or workflows) that are carried out in iterations. Agile Unified Process (AUP) is derived from RUP and can be seen as a simplified version of it. For example, RUP's Modelling, Requirements, and Analysis & Design disciplines are combined into a single Model discipline in AUP. Because of this simplification, it is easy for RUP teams to migrate to AUP if they want to, and RUP itself is flexible enough to be merged with agile modelling practices.

> Active stakeholder participation: In RUP projects, stakeholders, including customers, users, managers and operators, are often involved as part of the project disciplines. The team needs to assign modelling roles such as requirements specifier, business process designer, etc., to the participating stakeholders. In the Agile Unified Process, the more active the stakeholders are, the less the team has to rely on separate feedback cycles.

> Agile Modelling standards: UML (Unified Modelling Language) diagrams play a significant part in AUP. To maintain agility, agile teams often blend standards and guidelines together. In an RUP project, by contrast, the guidelines that teams are to adopt for creating modelling artifacts are included in the process itself.

> Gentle application of patterns: AUP teams get full freedom to choose which modelling patterns to use. In RUP, by contrast, these patterns are defined by the product depending on the modelling disciplines being followed. Easing the way patterns are applied has enhanced the performance of the Agile Unified Process, although this concept is not made as explicit as it should be.

> Application of the right artifacts: One of AUP's strengths is the advice it provides for creating various types of models. Recent versions of RUP also provide plenty of advice on creating non-UML artifacts (UI flow diagrams, data models, etc.).

> Collective ownership: This Agile Modelling concept is used to make enhancements in projects developed using AUP, but it assumes that the team culture supports open communication. While supporting this concept, AUP also lays strong stress on configuration management concerns, so the change management processes can sometimes be a hurdle in the path of development.

> Parallel creation of several models: This is an important concept of the Unified Process. The team is required to check the activity diagrams corresponding to each discipline and see whether they are being worked on in parallel. One issue with the Unified Process, however, is that its flow diagrams do not explain this well.

> Creation of simple content: Simplicity is assumed by the developers. The team needs to adopt guidelines stating the use of simple models, and the customers must be happy with this. However, many organizations find it difficult to adopt this culture.

> Temporary models should be discarded: The AUP team is free to decide which models to discard and which models to keep. Travelling light also helps in maintaining simplicity.

> Public display of models: Models should be displayed publicly to facilitate open communication. This way, all the artifacts are available to all the stakeholders.


Saturday, October 18, 2014

What is Agile Unified Process (AUP)?

The Agile Unified Process (AUP) is a simplified version of the Rational Unified Process. Its developer, Scott Ambler, describes it as a simple and easy-to-understand methodology for developing business application software. The Agile Unified Process makes use of agile concepts and techniques but remains true to its origin, the Rational Unified Process. AUP employs several agile techniques for developing software:
> Test driven development (TDD)
> Agile modelling (AM)
> Agile change management
> Database refactoring

All these techniques contribute to AUP's productivity. In 2011, AUP was estimated to account for about one percent of agile methodology usage. In 2012 it was superseded by DAD, the Disciplined Agile Delivery framework, and since then most people have stopped working on the Agile Unified Process. AUP differs from RUP in that it is organized around only seven disciplines:
> Modelling: Involves understanding the business of the organization and the problem domain of the project, and then identifying a feasible solution to address the problem.
> Implementation: The model is transformed into executable code, and basic testing, i.e., unit testing, is performed on this code.
> Testing: An objective evaluation is carried out to ensure that the artifact is of high quality. Testing is done to root out defects, to validate that the system works as desired, and to verify that the requirements are met.
> Deployment: The delivery plan for the system is laid out and executed so that the product reaches the customers.
> Configuration management: Managing access to the artifacts of the system, which includes tracking versions over time and managing and controlling changes to them.
> Project management: Involves directing the activities of the project onto the right track. These activities include risk management, assigning tasks to people, tracking their progress, and coordinating with people to ensure that the system is delivered on time and within budget.
> Environment: Involves providing ongoing guidance, i.e., standards and guidelines, and tools to ensure the process stays on the right track.

The agile unified process follows certain philosophies as mentioned below:
> The team knows what it is doing: People don't find it convenient to read highly detailed documentation, though timely training and good guidance are welcomed by all. AUP provides links to plenty of detail if you want it, but it does not force it on you.
> Simplicity: The documentation is kept as simple as possible without going into too much detail.
> Agility: To maintain agility, the AUP should conform to the principles and values stated in the agile manifesto and by the Agile Alliance.
> Emphasis on high-value activities: Only the activities that actually affect the project are considered; the rest are not counted.
> Choice of tools: In the Agile Unified Process any toolset can be used. However, agile experts often recommend using simple tools appropriate for the project.
> Tailoring: The Agile Unified Process can be tailored to the specific needs of your project.

There are two types of iterations in agile unified process as mentioned below:
> Development release iteration: For deploying the work to a demo area or to quality assurance.
> Production release iteration: For deploying the product into production.

These two iteration types are the result of refining RUP. RUP's modelling, requirements, and analysis disciplines are encompassed by the single modelling discipline of the Agile Unified Process. Even though modelling constitutes an important part of the agile process, it is not the dominating factor.


Thursday, October 16, 2014

What is agile modeling (AM)? An explanation. Part 2

Read the first part of this post (Agile Modeling: An explanation - Part 1)

The modeling should be carried forward in small increments; that way it is easy to find bugs if an increment fails or a fault occurs. As an agile developer, it is your duty to continuously strive to improve the code. This is just another way of showing that your code works in practice and is not mere theory. The stakeholders know what they want better than the developers, so by actively participating in the development process and providing constant feedback they can help in building better software overall.
The principle of assuming simplicity means keeping the focus on the required aspects instead of drawing out a highly detailed sketch. It also means using simple notations for depicting model components, using simple tools, taking information from a single source, discarding temporary models, and updating models only when required. Communication can be facilitated by displaying the models publicly, be it on a wall or a website, by applying agile modeling standards, by collective ownership of artifacts, and by modeling in a team. Gentle application of patterns improves the results considerably.
Formalized contract models are required when you need to integrate your system with other systems, such as a legacy web application or database. These models lay out a contract between you and the owners of those systems. The principles, values and practices of AM have been developed by many experienced developers. With AMDD (agile model-driven development) we do just enough high-level modeling in the initial stages of the project to work out its scope. In the later stages the modeling is carried out in iterations as part of the development plan. You might then take the model-storming route and move straight to writing code. Agile practices can be applied to most projects; you do not need to be working on an agile project to benefit from them, nor do you need to put every principle, practice and value to use to harness the agile advantage. Instead, it is better to tailor these practices to the specifics of your project.
Another way of harnessing agile benefits is to follow AMDD. MDD (model-driven development) in its agile form is called agile model-driven development. In AMDD, agile models are created before the source code is written, and these models are good enough to drive the overall development effort. AMDD is a strategy for scaling agile modeling. The main stages in this process are:
> Envisioning: Consists of initial architecture envisioning and initial requirements envisioning. These activities are carried out during the inception period.
> Iteration modeling
> Model storming
> Reviews
> Implementation

It is during the envisioning phase that we define the project's scope and architecture. This is done using high-level requirements and architecture modeling. The purpose of this phase is to explore the requirements of the system as far as possible and to build a strategy for the project. Writing detailed specifications at this point is a great risk to take. For short-term projects you may want to spare only a few hours on this matter. Agile modelers are advised to spend only the required time on this phase to avoid the problem of over-modeling. For exploring the important requirements you might need a usage model, which helps you explore how the product will be used by the users. An initial domain model is used for identifying fundamental business entity types. Issues with the user interface and its usability can be explored using an initial UI model.


Tuesday, October 14, 2014

What is agile modeling (AM)? An explanation. Part 1

Agile modeling is one of the most trusted development methodologies when it comes to producing effective documentation and software systems. Described at a high level, it comprises the best practices, principles and values required for modeling a high-quality software product (this description may seem a bit hyperbolic, but it tends to be true for the most part). These practices are lightweight in implementation, with a lot of flexibility. Agile Modeling is a set of principles, however, and on its own it is of little use; it has to be combined with fuller methodologies such as the Rational Unified Process, Extreme Programming, Adaptive Software Development, Scrum and so on. This combination enables us to develop software that satisfies all of our requirements. Agile modeling is governed by the following values, some of which are extended from Extreme Programming:
- Communication: All the stakeholders should maintain effective communication among themselves.
- Simplicity: Developers should strive to develop the simplest solution possible meeting all the requirements.
- Humility: As a programmer, you should have a sense of humility that you may not know everything and you should allow others to add value to your ideas.
- Feedback: There should be a mechanism for obtaining feedback early in every stage of development.
- Courage: You should have courage to make decisions and stay firm.

The principles on which the Agile Modeling is based are defined by the agile manifesto. Two of these principles are to assume simplicity and embrace changes. Assuming simplicity makes it easy to design software. You are able to cut out unnecessary secondary requirements and focus on the primary needs, thereby reducing the complexity. When you embrace the fact that there will be changes in the requirements, it adds flexibility to your project. As a result, you can develop more flexible projects that can adapt to the changes in requirements, and other changes, over time. 
The software evolves in increments. You should know that it is this incremental behavior that maintains the agility of the system. The requirements are ever-changing, so there should be a rapid feedback mechanism in place through which early feedback can be obtained. With this early feedback it becomes easy for you to ensure that your system is fulfilling all the needs. The modeling should be done with a purpose: if you don't understand the purpose of your project, its audience or its environment, you should hold off working on it until you are reasonably confident about them.
It is always wise to have a contingency plan, so it's good to have multiple models on standby. If your primary model fails, the standby models provide a backup. One thing worth noting is that agile models are not mere documentation; they are lightweight realizations of your system's purpose. Once that purpose is fulfilled, the models are discarded.
One belief of agile developers is that representation is less important than content: it is the content that matters, and there are a number of ways in which the same content can be represented. Focus should be maintained on quality work, because sloppy work is not valued anywhere. Adapting the principles of agile modeling to meet the needs of your environment is also important. Modeling in an agile manner requires practice. Agile modeling can be applied through various practices, and you have to pick the ones most appropriate for your project. However, there are a few fundamental practices that are always important for the success of an agile model:
> Parallel creation of several models.
> Application of the right software artifacts depending on the situation.
> Moving forward through continuous iteration.
One word of caution though! These models are just an abstract representation of the actual system and therefore cannot be perfectly accurate.


Monday, October 13, 2014

Bug fixing: Ensuring that there is regular contact with outside users

Bug fixing is an integral part of the product development process, and unless the bug finding and bug fixing process starts tapering down near the end of the product development cycle, you are in deep trouble. The period near the end of the cycle is one of the most stressful times for the product development team. After all, when you are nearing the end of the development cycle, any new defect that comes through can cause huge problems for the team and for the schedule. If the defect is serious, then there are many issues that come to light. One needs to figure out whether the area was tested well enough during the earlier parts of the schedule, as well as figure out the risk involved in making the fix for the defect. If the fix could impact a large enough area, then the team might well decide not to make the fix and ship with the defect. If a serious defect fix has been made near the end of the schedule, then a compromise needs to be made in terms of the amount of time available for testing the fix.
But this post is not about that problem. The problem here is more of a derived one. In my experience with multiple development cycles, I have seen defects come in near the end, or become more serious near the end, where the defect was seen earlier by external users (these are users who are not part of the core team, but could be users who are part of a restricted pre-release or a public release). The challenge comes in recognizing these defects as valid defects.
The advantage of outside users is the amount of diversity that they bring. People outside the core team who are testing the product do a lot more ad hoc testing, trying out different combinations that the core team would not be testing. In addition, the diversity of equipment that outside users have far outstrips the diversity of equipment available to the core testing team: they have more types of machines, different sets of equipment, and so on. This provides an advantage that the core team would do well to use.
At the same time, there is a cost associated with pre-release users. Some of the defects they find are already known to the core team, other defects may not be reproducible by the core testing team, and yet others are significant defects that the core team takes up for fixing. For a number of defects, the issue may have been critical for the outside user, but the core team may conclude either that the defect could not be replicated, or that the area was not of high enough priority to be taken up for fixing.
However, this is where we had problems and had to make changes. Near the end of the cycle, we would find some defects during the final stage of ad hoc testing that had been found by outside users but which the team had dismissed. The impact on the schedule and the increased stress level of the team were among the byproducts. To control this problem, we decided to put a lot of effort into evaluating the defects raised by outside users, including remote control of their machines to try to find the defect cause and spending much more time analyzing each defect, and we had some success with this effort. We were able to figure out some of these more serious defects, which in turn reduced the chance of hitting them near the end of the cycle, and it also had the byproduct that the outside users felt their defects were getting more serious attention and went on to report some great defects.


Friday, October 10, 2014

What are some of the limitations / challenges of Adaptive Software Development (ASD)?

The Adaptive Software Development (ASD) culture is the result of the efforts of Sam Bayer and Jim Highsmith in the field of rapid application development. The methodology aims at developing software that is capable of adapting continuously to changes in its working environment. In ASD, in place of the waterfall approach, we have cycles of speculating, collaborating and learning. It is because of this dynamic cycle that the software is able to adapt to the changing state of the requirements and learn through it. The cycle is tolerant of change, risk-driven, timeboxed and iterative.
Throughout the process the ASD life cycle remains focussed on the mission. Since adaptive software uses information available from the environment to improve its working pattern, it becomes difficult for it to gather usable information as complexity increases. The effectiveness of adaptive software development is reduced by the complexities of the environment. Today we expect more from software, and in critical situations that we never anticipated earlier, so complex environments pose a great challenge. There are three dimensions that contribute to this complexity:
> Increasing number of users: Software is no longer used only by professionals; everyone uses it.
> Increasing number of systems: More systems mean more interactions between them. Most of the system networks that we have now are heterogeneous, and these are harder to maintain than homogeneous networks.
> Increasing number of resources and goals: The most common trade-off that programmers make is between time and space. Now there are several other things to worry about, including security, bandwidth, money, quality, resolution and so on.

These three dimensions make it even harder for designers to design a system. These factors are impossible to predict, so the right decisions cannot always be made. This results in a product with a short lifetime, and every now and then upgrades will be required to modify the software. Other factors related to complex environments that pose a challenge for adaptive software are:
> uncertainty
> hidden inputs
> non-determinism
> uniqueness
> continuity
> the real world

Other things that put limitations on adaptive software development are the following four myths:
> The traditional belief is that the specifications must be determined first. But this is not always the case: one specification can be taken at a time and refined in later stages. The aim should be to combine several components together successfully, not to develop a single app.
> People usually assume that maintenance means the program code has degraded. The truth is that the code remains the same while its environment changes, so maintenance involves evolving the code to satisfy these changing needs. When programmers view this through a maintenance perspective, they tend to preserve the old program structure.
> It is undeniable that abstraction has an important role to play in the development process. But when we treat a process as a black box we are actually ignoring the practical problem it faces of resource usage. In adaptive software development we take a new approach called open implementation, in which modules have two interfaces: one for input/output and the other for checking performance. These two interfaces are orthogonal to each other. ASD also adds a feedback mechanism to this orthogonal interface system, making it much better.
> While designing software we consider all the possible alternatives; the desired one is kept and the others are rejected. This means that once a requirement changes, we again have to see what alternatives are available and which one can be implemented, which might require changing the entire program structure. The best thing to do here is to program the system so that it reconfigures itself.


Thursday, October 9, 2014

How is Adaptive Software Development (ASD) different from Scrum? Part 2

Read the First Part (Adaptive Software Development being different from Scrum - Part 1)

To understand the further differences between the two, it is important that we know what agile development is. The Agile Manifesto defines the agile methodology. Commonly cited agile methodologies include XP, Crystal Orange, Scrum, Adaptive Software Development, DSDM, Pragmatic Programming and Feature-Driven Development. All these methods differ in their mechanisms and the parameters they take, and each has a different agile life cycle. For ASD, the life cycle depends on what techniques we are going to use; generally speaking, it doesn't have a life cycle of its own. In contrast, Scrum has an abstract life cycle which packs certain activities into its schedule.

Here, however, we discuss the differences based upon a life cycle with some general phases.
- Project initiation: This step includes justification of the project and determining the requirements.
- Planning: Laying out an elaborate plan for development and leaving some room for contingency actions. Scrum doesn't offer a choice of optional phases; it must work within predefined options, whereas adaptive software development can have many options because it does not limit itself to a few techniques.
- Elaboration of requirements: This stage is optional. The requirements are explained in great detail. Scrum does not implement this stage separately, but it may be done in adaptive software development. Since the requirements are still at a high level, they need to be broken down into simpler specifications. The requirements are collated into a requirements document after classification. This document also contains use cases apart from the requirements.
- Architecture: Both Scrum and ASD are based on the agile principle of designing today and refactoring tomorrow. Software developed using ASD can be modified to suit a changing environment just by adjusting its software capabilities and leaving the hardware unchanged. Software developed through Scrum has to be modified through manual intervention and may even require hardware changes. A system architecture document is prepared in this phase.
- Release: The software is released to the customer. There is at least one timebox. Scrum can have many releases depending upon the development span of the project, whereas adaptive software development usually delivers the product in one go.

One of the important questions to be asked during the project initiation phase is whether to invest more or not. Not all methods answer this question, but addressing it is an important part of adaptive software development; Scrum doesn't address it explicitly. The next step is choosing the appropriate method. Scrum claims that it can be used for any project. Adaptive software development creates a list of all the alternatives that could be used and chooses the best one.
The level of formality to be maintained in Scrum is given by artifacts such as running code and the backlog. In adaptive software development there are too many choices to choose from, so it makes use of vision statements to clear up the confusion. Scrum defines a sprint goal, which is a kind of timeboxed vision, i.e., it follows one choice for some time and can then change if required.
Scrum avoids the elaboration phase to speed up delivery. It uses a product backlog maintained by the product owner.
Since adaptive software development may have an elaboration phase, it may also have an initial draft plan which turns into a complete plan after the requirements are determined. Scrum plans based on timeboxes.
Both methodologies help you work faster and create products with better quality. The agile customer is also an important role in the agile process. The customers are responsible for providing input during prototyping sessions, and organizing and controlling user testing is the responsibility of the user.


Tuesday, October 7, 2014

How is Adaptive Software Development (ASD) different from Scrum? Part 1

Adaptive software development and Scrum both come under the overall heading of agile software development, but they are two different beasts. They are two different approaches with the same aim, namely agile development. Below, we will write more on what differentiates the two.
It is necessary to know the differences between these two approaches since Scrum, as a part of agile software development, is already a very well-known methodology, whereas adaptive software development is relatively new, but also a strong upcoming methodology ready to take the software development market by storm. Agile methods are often preferred over traditional development approaches because they incorporate major factors such as increased levels of customer participation and satisfaction, project delivery flexibility, and acceptance of ongoing requirements or design changes at any point in the development process. This is the usual development scenario, and that is why agile development methods are mostly used. The most commonly used agile methods include:
-  Adaptive software development
- Scrum
- Feature driven development
- Extreme programming

Even though all these strategies aim at the same set of objectives (well, almost), they do take somewhat different paths (having some items in common while others differ). It is important to know the differences between them; otherwise, how are you going to maintain agility in your development structure? Comparing the different agile methodologies, their common points and their differences lets managers choose the most appropriate agile method for their software's characteristics.
There are many traditional techniques for drawing comparisons between two software development techniques. In this post, Scrum and ASD are compared on the basis of two areas, namely software construction and software requirements.
The first difference between adaptive software development and the Scrum approach is that in the former there is no predefined technique to be followed, whereas in Scrum there are certain predefined techniques to be followed. The focus of adaptive software development is on the end result rather than on the steps; it implements a particular technique only if it is required.
Over the last couple of years, agile principles have come to encompass methods such as extreme programming, feature-driven development, adaptive software development and so on. So far, companies using Scrum or adaptive software development have been able to deliver successful products. Scrum is described as a process for management and control used for cutting down complexity, and it is used by organizations that want to deliver products faster. This approach to software development has been applied to advanced development efforts, including software, and it usually works well in an object-oriented development environment.
On the other hand, adaptive software development is described as a methodology for developing reliable and re-adaptive software. It is used by organizations that want their products to function in a wide range of changing conditions. Software built with this approach is capable of reconfiguring itself to better suit the environment, and the approach can be used in any kind of environment; any technique can be applied here to make the software as flexible as possible. In Scrum, the owners are responsible for estimating backlog items; in adaptive software development it is mainly the task of the developers.
Scrum follows a timeboxing method for speedy delivery. At the start of each timebox, a sprint planning meeting is held. The team members pick up an achievable portion of the backlog and decide how it can be achieved.
Scrum may have 1 to 4 teams with 5 to 9 members each, and adaptive software development may have even more teams, dedicated to each phase of the development process. Agile developers play a number of roles: analysing, estimating, designing, coding and testing. Even though the typing burden is less, there is a lot of stress because of the short timeframes. But you get to see your progress through user reviews and daily meetings.

Read next part  (Adaptive Software Development being different from Scrum - Part 2)


Saturday, October 4, 2014

What is Adaptive Software Development (ASD)?

There are many software development processes, with adherents and proponents for each one of them. Here we discuss one of them: ASD, or Adaptive Software Development. This process is the result of Sam Bayer's and Jim Highsmith's efforts concerning the RAD process, i.e., rapid application development. ASD is based upon the principle that the development process must be continually adapted to the work at hand as a normal affair.
This development process is sometimes used as a replacement for the traditional waterfall approach. The steps in the waterfall model are replaced with repeating cycles of speculation, collaboration and learning. This cycle is quite dynamic in nature and allows us to maintain a continuous learning process. It also makes it easy to adapt to the project's next state. An ASD life cycle has the following characteristics:
• Mission – focused
• Feature-based
• Iterative
• Timeboxed
• Risk driven
• Tolerant to changes

Here, by speculation, we refer to the "paradox of planning": adaptive software development assumes that all stakeholders are likely to be mistaken about certain aspects of the project's mission, and these mistakes are usually made while defining the requirements of the project. Collaboration covers the efforts to maintain a balance between the work and the environment and to adapt to changes in the environment (caused mainly by requirements, technology, software vendors, stakeholders and so on).
The learning cycle, on the other hand, consists of many short iterations involving designing, building and testing. The aim of these iterations is to gather knowledge from the small mistakes made and from false assumptions, and then to correct them. This leads to great experience in the domain.
A lot of money and time goes into developing a piece of software, yet it still proves to be brittle in some unexpected situations. So how does adaptive software development make it better? As the name suggests, ASD is focused upon developing programs that are capable of readily adapting to changes arising from user requirements and the development environment. It includes an explicit representation of the actions that the program can take and also of the goals of the user. With this, it becomes possible for the user to change the goals without rewriting the code. It is often used for producing what we call agent-based applications.
We also have object-oriented programming and structured programming methodologies for developing software. With the object-oriented approach, reorganization in the face of change is easy because we divide the functionality into different classes, but it still requires the programmer to intervene to make the change. Therefore, this approach tends to be confined to developing user-initiated, event-based applications. Structured programming was used to develop input/output-based applications only, since it cannot tolerate changes in the specifications; database management programs are a typical example of this kind of application.
There are several other methodologies for developing software, but they only provide ways of managing change rather than dealing with it. It is only adaptive software development that gives a real way of adapting to change and dealing with it without a programmer's intervention. The best thing about adaptive software development is that it collects information about environmental changes and uses it to improve the software. But today's complex software and operating environments are making it less effective. Factors such as the increasing number of users, systems, interactions, resources, goals, etc. add to the complexity of the software. Now, apart from time and space, programmers have to look out for money, security, resolution and so on.


Friday, October 3, 2014

What is the Boolean satisfiability problem?

The Boolean satisfiability problem is often abbreviated to SAT. In computer science, the Boolean satisfiability problem is concerned with determining whether or not an interpretation exists that satisfies a given Boolean formula. Put the other way round, it asks whether the variables of a given Boolean formula can be assigned values in such a way that the formula evaluates to TRUE. If no such assignment exists, the function that the formula expresses is FALSE for all possible assignments of the variables; in that case the problem is said to be unsatisfiable, and satisfiable in the former case. Consider the following examples for a better understanding:
- x AND NOT y: This formula is satisfiable, since the assignment x = TRUE, y = FALSE makes it evaluate to TRUE.
- x AND NOT x: This formula is unsatisfiable, since x cannot have two values at the same time.
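To make the definition concrete, here is a minimal brute-force satisfiability check in Python. This is an illustrative sketch only (the formula is supplied as an ordinary Python function over an assignment dictionary, a representation chosen just for this example); real SAT solvers use far more sophisticated techniques.

```python
from itertools import product

def is_satisfiable(formula, variables):
    """Try every TRUE/FALSE assignment of the variables and report
    whether any of them makes the formula evaluate to True."""
    for values in product([True, False], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if formula(assignment):
            return True, assignment
    return False, None

# x AND NOT y -- satisfiable (e.g. x=True, y=False)
print(is_satisfiable(lambda a: a["x"] and not a["y"], ["x", "y"]))

# x AND NOT x -- unsatisfiable
print(is_satisfiable(lambda a: a["x"] and not a["x"], ["x"]))
```

Note that this exhaustive check takes time exponential in the number of variables, which is exactly why SAT is interesting: nobody knows how to do substantially better in the general case.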

The Boolean satisfiability problem actually falls under the category of NP-complete problems. To date we do not have any algorithm that can efficiently solve all instances of such problems, and it is widely believed that no such algorithm exists. However, there are many decision and optimization problems that can be transformed into SAT instances. The algorithms that are used to solve a large subset of these instances in practice are grouped into a class called SAT solvers. SAT instances find application in fields such as:
- Circuit design
- Automatic theorem proving

Extensive research has been going on to extend the capabilities of SAT-solving algorithms, but at present we do not have any method that can deal efficiently with all instances of SAT. A Boolean expression, also called a propositional logic formula, is composed of operators such as AND, OR, NOT and parentheses, together with variables. A formula is called satisfiable only if it can be made to evaluate to TRUE by assigning appropriate logical values, i.e., TRUE or FALSE, to its variables. This check is what determines the satisfiability of a SAT instance. SAT is an example of a decision problem, a class of problems that plays a major role in the following fields:
- Theoretical computer science
- Complexity theory
- Artificial intelligence and
- Algorithmics

There are many cases of SAT problems where the formulas are required to have a certain structure. A literal is of two types, namely a positive literal (a variable) and a negative literal (the negation of a variable). A clause is a single literal or a disjunction of literals. If a clause contains at most one positive literal, it is said to be a Horn clause. If a formula is a conjunction of clauses, it is a CNF (conjunctive normal form) formula.
It has been observed that defining the notion of a generalized CNF formula is useful for solving some instances of SAT problems. Different versions of the problem arise from the different sets of allowed operators. The laws of Boolean algebra can be used to transform every propositional logic formula into an equivalent CNF form, which might be considerably longer than the original. SAT was the first problem shown to be NP-complete; before 1971, the concept of NP-completeness did not even exist. If formulas are restricted to disjunctive normal form, SAT becomes trivial: such formulas are satisfiable if and only if at least one of their conjunctions is satisfiable.
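The DNF case can be made concrete with a short sketch. A conjunction is satisfiable exactly when it contains no complementary pair of literals, so checking a whole DNF formula is linear in its size. The representation below (a list of conjunctions, each a list of literal strings such as "x" or "~x") is an assumption made for this example only:

```python
def dnf_satisfiable(dnf):
    """dnf: list of conjunctions; each conjunction is a list of literals
    written as 'x' (positive) or '~x' (negated).
    A DNF formula is satisfiable iff at least one conjunction is free
    of complementary literals."""
    for conjunction in dnf:
        positives = {lit for lit in conjunction if not lit.startswith("~")}
        negatives = {lit[1:] for lit in conjunction if lit.startswith("~")}
        if positives.isdisjoint(negatives):
            return True   # this conjunction alone can be satisfied
    return False

# (x AND ~y) OR (y AND ~y) -> satisfiable via the first conjunction
print(dnf_satisfiable([["x", "~y"], ["y", "~y"]]))  # True
# (x AND ~x) -> unsatisfiable
print(dnf_satisfiable([["x", "~x"]]))               # False
```

No such shortcut is known for CNF formulas, which is why general SAT is hard while DNF satisfiability is trivial.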


Thursday, September 25, 2014

Which languages support logic programming?

In this article we discuss the languages that support logic programming. There are many such languages, and we divide them into the following two categories:
- Prolog programming language family
- Functional logic programming languages

First we shall discuss the functional logic programming languages:
1. Algebraic Logic Functional programming language, or ALF: Combines the techniques of logic programming and functional programming. The language is based on Horn clause logic. The resolution rules for solving literals and evaluating functions form the foundation of its operational semantics. It follows a leftmost innermost narrowing strategy to reduce the number of steps in solving a problem; these operational semantics are therefore more efficient and powerful than those produced by Prolog's resolution strategy.
2. Alice ML: This programming language was designed at Saarland University and is a dialect of Standard ML, with added support for concurrency (including distributed computing and multithreading), lazy evaluation, constraint programming, etc. The language uses a concept called a "promise", by which one thread promises to deliver a future value that other threads can use. Dataflow synchronization is thus possible in Alice ML through future-typed variables and promises.
3. Ciao
4. Curry
5. Leda: Stands for "library of efficient data types and algorithms" and is a multi-paradigm programming language that mixes the features of logic-based, object-based, functional and imperative programming in one language.
6. Mercury
7. Metal
8. Mozart
9. Oz: Another multi-paradigm programming language, designed for programming language education. It was developed in 1991 at the Swedish Institute of Computer Science. The primary implementation of Oz is the Mozart Programming System, which is open source and supports many platforms including Microsoft Windows, Mac OS X, Linux and Unix.
10. Visual prolog

Now we shall see the other category, i.e., the Prolog programming language family:
1. B-Prolog: This is a high-level implementation of Prolog with some additional features such as event-handling rules, matching clauses, arrays, declarative loops, tabling, constraint solving, etc. It was developed in 1994 and is now a popular CLP system. Though B-Prolog is a commercial product, it is free for research purposes. A clause in which the input/output unifications and the determinacy are denoted explicitly is called a matching clause.
The matching clauses are translated into corresponding trees by the compiler and indexes are generated; this compilation is easier than the compilation of normal Prolog clauses. B-Prolog overcomes the absence of an active sub-goal programming facility in Prolog by introducing action rules (AR), a simple but powerful language for this purpose. The sub-goals are called agents; activation of an agent is followed by the execution of an action. The finite domain solver of B-Prolog was heavily influenced by the CHIP system. For creating arrays, B-Prolog provides a built-in called new_array(X, Dims), where X stands for an uninitialized variable and Dims is a list of positive integers specifying the array dimensions.
2. Eclipse
3. GNU prolog
4. Jprolog
5. KL0 and KL1
6. Logtalk
7. Objlog: A frame-based language that combines Prolog II with objects, developed at CNRS.
8. Prolog
9. Prolog++
10. Strawberry prolog
11. Tuprolog
12. Visual prolog
13. YAP


What is a programming paradigm?

The fundamental style in which you design and write computer programs is called a programming paradigm. It's the way you build the elements and structures of your programs. Commonly recognized programming paradigms include the following:
- Imperative
- Declarative
- Functional
- Object – oriented
- Logic
- Symbolic

Just as different methodologies define software engineering, different paradigms define programming languages. Different languages are designed to support different paradigms. For example,
- The object-oriented paradigm is adopted by Smalltalk
- Functional programming is adopted by Haskell

Though there are certain programming languages that support only one kind of paradigm, there are many others that can work with multiple paradigms. Examples are:
- Object Pascal
- C++
- Java
- C#
- Scala
- Visual Basic
- Common Lisp
- Scheme
- Perl
- Python
- Ruby
- Oz
- F#

Pascal and C++ programs can be purely object-oriented, purely procedural, or contain features of both; it is up to the programmers how they want to use the paradigms. In object-oriented programming, the program is treated as a collection of objects that interact with each other, whereas in functional programming the program is treated as a sequence of evaluations of stateless functions.
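As a small, hypothetical illustration of that difference (not tied to any particular language named above), the same running total can be computed in an object-oriented style, where state lives inside an object and behaviour is expressed as methods, and in a functional style, where the result comes purely from evaluating stateless functions:

```python
# Object-oriented style: the Counter object carries mutable state,
# and behaviour is expressed as methods that act on that state.
class Counter:
    def __init__(self):
        self.value = 0

    def increment(self, amount):
        self.value += amount
        return self.value

counter = Counter()
counter.increment(2)
counter.increment(3)
print(counter.value)  # 5

# Functional style: no mutable state; the same total is obtained
# by evaluating stateless functions over the input values.
from functools import reduce
print(reduce(lambda total, amount: total + amount, [2, 3], 0))  # 5
```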
Process-oriented programming helps in designing programs as a set of processes running concurrently on systems with multiple processors; shared data structures are used by these processes. Some techniques are allowed by a programming paradigm while others are forbidden. For example, the use of side effects is forbidden in pure functional programming, and the use of the dangerous goto statement is not allowed in structured programming. Because of this, modern programming paradigms are sometimes considered overly strict when compared to their older counterparts.
Proving theorems about a program can be easier if you avoid certain techniques, because it becomes easier to understand the program's behavior. Programming paradigms are sometimes compared with programming models, which are abstractions of computer systems. An example is the von Neumann model, which is used in sequential computers; there are also a number of models for computers that use parallel processing.
Most of these models are based upon message passing, shared memory, or a hybrid of the two. Machine code and assembly language instructions are the lowest-level programming paradigms. Assembly language makes use of mnemonics for operations; machine code and assembly are often referred to as the first- and second-generation languages. Assembly language is still in use for programming embedded systems to obtain direct control over the machine. The next generation of languages is represented by the procedural languages, such as COBOL, Fortran, ALGOL, PL/I, BASIC and C; the procedural paradigm is adopted by all of these languages.
The experience and ability of the programmer affect the efficiency and efficacy of the problem's solution. Later, object-oriented languages came onto the scene, such as Smalltalk, Simula, Java, Eiffel and C++. These languages use objects (data together with the functions or methods for handling it) to model real-world problems, and the object's methods are the only way through which the user can access its data. The object-oriented paradigm has even led to the possibility of object-oriented assembly languages; for example, HLA (High Level Assembly) is such a language, providing support for advanced data types. In declarative programming paradigms, the computer is told what the problem is but not how to solve it; the program thus consists of properties that can be used to find the expected result. The program is not a procedure.


Wednesday, September 24, 2014

What is a reasoning system?

In the field of information technology, a software application used to draw conclusions from available knowledge is called a reasoning system. Reasoning systems work on the principles of logical deduction, induction and other reasoning techniques. They fall under the category of more sophisticated systems called intelligent systems. Reasoning systems have a very important role to play in the fields of artificial intelligence and knowledge engineering.
In these systems, knowledge that has already been acquired is manipulated to generate new knowledge. By knowledge here we mean a symbolic representation of propositional statements and facts based on assumptions, beliefs and assertions. Sometimes the knowledge representations used may be connectionist or sub-symbolic; an example is a trained neural net. The process of inferring and deriving new knowledge via logic is automated by means of reasoning systems. Reasoning systems also provide support for procedural attachments, which apply knowledge in a given situation or domain. Reasoning systems are used in a wide range of fields:
- Scheduling
- Business rule processing
- Problem solving
- Complex event processing
- Intrusion detection
- Predictive analysis
- Robotics
- Computer vision
- Natural language processing

As we mentioned above, logic is used by reasoning systems to generate knowledge. However, there is a lot of variation in the systems of logic that are used, and in their degree of formality. The majority of reasoning systems use symbolic logic and propositional logic. The variations seen in practice are usually first-order logic (FOL) representations, their hybrid versions, extensions, etc., which are mathematically very precise.
There are other additional logic types such as the temporal, modal, deontic logics etc. that might be implemented explicitly by the reasoning systems. But, we also have some reasoning systems for the implementation of the semi – formal and imprecise approximations to the logic systems that are already recognized. A number of semi – declarative and procedural techniques are supported by these systems for modeling various reasoning strategies.
The emphasis of these reasoning systems is on pragmatism rather than formality, and they depend on procedural attachments and other custom extensions to solve real-world problems. Other reasoning systems make use of deductive reasoning to draw inferences. Inference engines support both forward reasoning and backward reasoning in order to draw conclusions via modus ponens. These reasoning methods are recursive and are known as forward chaining and backward chaining respectively.
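As a small illustration of the forward-chaining idea (a toy sketch, not the API of any particular inference engine), known facts are repeatedly combined with if-then rules via modus ponens until no new conclusions can be derived:

```python
def forward_chain(facts, rules):
    """facts: set of known propositions.
    rules: list of (premises, conclusion) pairs meaning
    'if all premises hold, then conclusion holds' (modus ponens)."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in derived and all(p in derived for p in premises):
                derived.add(conclusion)
                changed = True
    return derived

rules = [
    ({"rain"}, "wet_ground"),
    ({"wet_ground"}, "slippery_road"),
]
print(forward_chain({"rain"}, rules))
# {'rain', 'wet_ground', 'slippery_road'}
```

Backward chaining works in the opposite direction: it starts from a goal and recursively looks for rules whose conclusion matches that goal, reducing it to sub-goals until known facts are reached.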
Though the majority of systems use deductive inference, a smaller portion also uses inductive, abductive and defeasible reasoning methods. Heuristics may also be used to find acceptable solutions to intractable problems. Reasoning systems work under either the open world assumption (OWA) or the closed world assumption (CWA); the first is associated with the semantic web and ontological knowledge representation. Different systems also take different approaches to negation.
Apart from the bitwise and logical complement, other existing forms of negation, both strong and weak (such as inflationary negation and negation-as-failure), are also supported by reasoning systems. There are two types of reasoning used by reasoning systems, namely monotonic reasoning and non-monotonic reasoning. Many reasoning systems are capable of reasoning under uncertainty, which is particularly useful for situated reasoning agents that have to deal with uncertain representations of the world. Some common approaches include:
- Probabilistic methods: Bayesian inference, Dempster-Shafer theory, fuzzy logic, etc.
- Certainty factors
- Connectionist approaches
Types of reasoning systems include:
- Constraint solvers
- Theorem provers
- Logic programs
- Expert systems
- Rule engines


Tuesday, September 23, 2014

What is concurrency control with respect to databases?

Concurrency control is a technique applied in the fields of operating systems, databases, computer programming, multiprocessing, etc. to ensure that concurrent operations produce correct results as quickly as possible. Both the hardware and the software parts of a computer are made up of smaller modules and components, each of which is designed and programmed to work correctly according to some consistency rules. These components interact concurrently through messages and shared data, and that interaction can lead to results that violate those rules.
The basic idea of concurrency control is to provide methodologies, theories and rules for enforcing consistency in the whole system. Implementing concurrency control reduces performance, because the constraints we apply to the components have the effect of reducing overall speed. The goal, therefore, is to achieve consistency as efficiently as possible without reducing performance below acceptable levels.
The drawbacks of concurrency control include additional complexity and generation of more overhead in using a concurrent algorithm. The concurrent algorithms generate more overhead when compared to their sequential algorithm counterparts. If the concurrency control mechanism fails, it can lead to torn read and write operations and can corrupt the data.
In this article we talk about concurrency with respect to the databases. Concurrency control is implemented in DBMS, distributed applications etc. for ensuring that the concurrent data transactions are accomplished without causing damage to the data integrity. Distributed applications include cloud computing and grid computing. Concurrency control is also used in some other transactional objects.
It is an essential part of the systems where two or more transactions overlap over the same time instant and can operate on the same data. This happens in almost any general purpose database management system.
Since the advent of database systems, research has been going on into this concept. Serializability theory is the best-established theory for defining the concept of concurrency control; it also lets us design and analyze concurrency control methods and mechanisms as effectively as possible. There is another theory that emphasizes concurrency control over abstract data types rather than over plain atomic transactions; although it has a wider scope and is more refined, it adds more complexity to the system. Both theories have their advantages and disadvantages, and merging them might help, because they are complementary to some extent.
For ensuring proper concurrency control and correct execution of the transactions, only the serializable transactions and schedules are generated by the system and executed. In some cases, the serializability might be relaxed intentionally by the system for increasing the performance. But, this is done only in those cases where it won’t generate incorrect output.
There are many cases when the transactions fail. Here the system needs to have a recoverability property for recovering from the damage. A good database system also ensures that the result of the transactions that have committed is not lost if the system is switched off accidentally or crashes. On the other hand it also ensures that the incomplete results of the aborted transactions are erased and the actions are rolled back. The ACID rules (mentioned below) characterize the transactions:
- Atomicity: Each transaction is all or nothing; either all of its operations take effect, or none of them do.
- Consistency: Each transaction must leave the database in a consistent state; what counts as consistent depends largely on the application and its users.
- Isolation: Every transaction should be executed in isolation, i.e., it should not interfere with others.
- Durability: The results of committed transactions should persist.
Nowadays, as database systems become more distributed, the focus is increasingly on distributing the concurrency control mechanism itself.


Saturday, September 20, 2014

What are some concurrency control mechanisms (with respect to databases)?

In this article we discuss about different types of concurrency control mechanisms that are implemented in databases.
- Optimistic: This type of concurrency control method delays transaction checking. It does not check the integrity and isolation rules (recoverability, serializability, etc.) until the transaction has completed executing all of its associated operations, nor does it block any of the transaction's operations. If, upon commitment, the transaction's operations turn out to violate these rules, the transaction is aborted. Immediately after being aborted, the transaction is executed again, though this generates restart and re-execution overhead. This mechanism is a good choice if there are not too many aborts.
- Pessimistic: The concurrency control methods falling under this category block an operation of a transaction if it is suspected of violating the rules, and do not allow it to execute until the possibility of a rule violation has passed. Such prolonged blocking of transactions can, however, reduce performance drastically.
- Semi-optimistic: These concurrency control mechanisms block a transaction only in situations where it is important to do so; in other situations the rules are checked at commit time, as in optimistic mechanisms.

The performance of these different types of concurrency control mechanisms differs, meaning they have different throughputs (rates of transaction completion) depending on various factors such as the level of parallelism, the mix of transaction types, etc. The trade-offs between the categories should be considered, and the one providing the highest performance in the particular situation should be chosen. Two transactions mutually locking each other result in a deadlock: the involved transactions wait forever and are unable to complete. The non-optimistic concurrency control mechanisms are observed to produce more deadlocks, and transactions have to be aborted to resolve them. All this deadlocking, blocking and resolving introduces performance delays, and these are the major trade-off factors between the types. Below we mention some major concurrency control methods, with many variants falling under the categories mentioned above:
- Locking: Its variants include two – phase locking (2PL). This mechanism controls access to data through locks acquired by transactions. If a transaction tries to acquire a lock on a piece of data already locked by another transaction, it is blocked until the latter releases its lock, depending on the type of access operation and the type of lock.
- Serialization graph checking or precedence graph checking: This mechanism checks the schedule's precedence graph for cycles and, if any are found, breaks them by aborting the transactions involved (a small sketch of this check follows the list).
- Timestamp ordering: Timestamps are assigned to the transactions, and access to data is controlled by continually checking transaction timestamps against per – item read and write timestamps.
- Commitment ordering: The chronological order in which transactions commit is controlled so that it remains compatible with their precedence order.
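To make the serialization – graph idea concrete, here is a minimal sketch (not any particular product's implementation): conflict edges between transactions are recorded and a depth – first search looks for a cycle. The integer transaction IDs and the way conflicts are collected are assumptions made for the example.

    import java.util.*;

    public class PrecedenceGraph {
        private final Map<Integer, Set<Integer>> edges = new HashMap<>();

        // Record that transaction 'from' must precede transaction 'to'
        // (e.g. because of a write-read, read-write or write-write conflict on some item).
        public void addConflict(int from, int to) {
            edges.computeIfAbsent(from, k -> new HashSet<>()).add(to);
            edges.computeIfAbsent(to, k -> new HashSet<>());
        }

        // A schedule is conflict serializable only if this graph has no cycle.
        public boolean hasCycle() {
            Set<Integer> visiting = new HashSet<>();
            Set<Integer> done = new HashSet<>();
            for (Integer node : edges.keySet()) {
                if (dfs(node, visiting, done)) return true;
            }
            return false;
        }

        private boolean dfs(int node, Set<Integer> visiting, Set<Integer> done) {
            if (visiting.contains(node)) return true;  // back edge found: cycle
            if (done.contains(node)) return false;
            visiting.add(node);
            for (int next : edges.get(node)) {
                if (dfs(next, visiting, done)) return true;
            }
            visiting.remove(node);
            done.add(node);
            return false;
        }
    }

A scheduler using this kind of check would abort one of the transactions on each detected cycle in order to break it.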

Some other concurrency control methods are used alongside the above mentioned types:
- Index concurrency control: Access operations are synchronized on the indexes rather than on the user data; specialized methods can gain extra performance here.
- Multi – version concurrency control or MVCC: Each write of an object generates a new version of that object, so other transactions can still read the object. This increases concurrency without compromising performance.
- Private workspace model: A private workspace is maintained by each transaction for accessing the data. Any changes made to the data become visible to other transactions only after the transaction commits (see the sketch after this list).
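A minimal sketch of the private workspace idea, assuming a simple in – memory key/value store: each transaction buffers its writes locally and merges them into the shared store only at commit.

    import java.util.HashMap;
    import java.util.Map;

    public class PrivateWorkspaceTxn {
        private final Map<String, String> shared;                       // globally visible data
        private final Map<String, String> workspace = new HashMap<>();  // this transaction's local changes

        public PrivateWorkspaceTxn(Map<String, String> shared) {
            this.shared = shared;
        }

        public String read(String key) {
            // Reads see this transaction's own uncommitted writes first.
            if (workspace.containsKey(key)) return workspace.get(key);
            synchronized (shared) { return shared.get(key); }
        }

        public void write(String key, String value) {
            workspace.put(key, value);    // invisible to other transactions for now
        }

        public void commit() {
            synchronized (shared) {
                shared.putAll(workspace); // changes become visible only at this point
            }
        }

        public void abort() {
            workspace.clear();            // nothing ever reached the shared store
        }
    }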


Friday, September 19, 2014

Best practices for concurrency control with respect to databases?

Today almost all service – oriented businesses have grown highly dependent on reliable and speedy access to their data, and most global enterprises need this access around the clock without interruptions. These reliability, availability and performance needs are met by database management systems (DBMS). A DBMS is thus responsible for two things: protecting the data it stores, and providing correct, reliable, all – time access to that data. The concurrency control and recovery mechanisms of the DBMS are responsible for carrying out these functions properly.
The concurrency control mechanism ensures that you see the effects of your own transaction as if it were running alone, even though hundreds of users may be accessing the database at the same time. The recovery mechanism ensures that the database is able to recover from faults. It is because of these functionalities that programmers feel free to add new parts to the system without much worry. A transaction is simply a unit of work consisting of several operations and updates, and every transaction is expected to obey the ACID rules. In this article we discuss some best practices for concurrency control.
- Two phase locking: Locking is perhaps the most widely used technique for maintaining control over concurrency. This mechanism provides two types of locks, namely the shared lock (S) and the exclusive lock (X), whose compatibility is defined by the compatibility matrix (a minimal sketch of this matrix follows after this list). According to the matrix, S locks on the same data item can be held by two different transactions at the same time, but this is not possible for X locks. With this policy multiple read operations can be carried out concurrently; in other words, read access to an item is protected by S locks while write access is protected by exclusive locks. Put simply, no transaction can obtain a lock on a data item already locked by another transaction if the two locks conflict. A transaction that requests a lock which cannot be granted at that instant is blocked until the conflicting transaction releases its lock.
- Hierarchical locking: In practice, the notion of conflicting locks is applied at different levels of granularity, and choosing the right granularity for locking an item is a trade – off between locking overhead and concurrency. Locking at the granularity of a tuple (one row) keeps concurrency at the maximum level; the disadvantage is that a transaction accessing many tuples must lock all of them, issuing the same number of calls to the lock manager and generating substantial overhead. This can be avoided by choosing a coarser granularity, but at the expense of more false conflicts.
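The compatibility matrix mentioned above boils down to a single rule: two lock requests are compatible only when both are shared. A minimal sketch:

    public class LockCompatibility {
        public enum Mode { S, X }   // shared (read) and exclusive (write) locks

        // S is compatible with S; every combination involving X conflicts.
        public static boolean compatible(Mode held, Mode requested) {
            return held == Mode.S && requested == Mode.S;
        }

        public static void main(String[] args) {
            System.out.println(compatible(Mode.S, Mode.S)); // true  - concurrent readers allowed
            System.out.println(compatible(Mode.S, Mode.X)); // false - writer must wait for readers
            System.out.println(compatible(Mode.X, Mode.S)); // false - reader must wait for the writer
            System.out.println(compatible(Mode.X, Mode.X)); // false - writers conflict with each other
        }
    }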

Two phase locking is categorized as a pessimistic technique, since it assumes there will be interference among the transactions and takes measures against it. Optimistic concurrency control provides an alternative: transactions carry out their operations without acquiring locks, and a validation phase is performed before they can commit to ensure that serializability is not violated. A number of optimistic protocols have been proposed. The validation process makes sure that the read and write operations of two concurrently running transactions do not conflict; if such a conflict is detected during validation, the transaction is aborted and forced to restart. Thus, to ensure isolation, optimistic mechanisms rely on restarting transactions whereas locking policies rely on blocking them.
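A rough sketch of the optimistic, validation – based approach described above. The read/write set bookkeeping and the simple global list of committed transactions are assumptions made for the example; real protocols are considerably more refined.

    import java.util.*;

    public class OptimisticTxn {
        private static final List<OptimisticTxn> committed = new ArrayList<>();
        private static final Map<String, String> db = new HashMap<>();

        private final Set<String> readSet = new HashSet<>();
        private final Map<String, String> writeSet = new HashMap<>();
        private final int startIndex;

        public OptimisticTxn() {
            synchronized (OptimisticTxn.class) {
                startIndex = committed.size();  // remember which commits we have already seen
            }
        }

        public String read(String key) {
            if (writeSet.containsKey(key)) return writeSet.get(key);
            readSet.add(key);
            synchronized (OptimisticTxn.class) { return db.get(key); }
        }

        public void write(String key, String value) {
            writeSet.put(key, value);           // buffered locally; nothing is ever blocked
        }

        // Validation phase: abort if any transaction that committed after we started
        // wrote an item we read; otherwise install our writes atomically.
        public boolean commit() {
            synchronized (OptimisticTxn.class) {
                for (int i = startIndex; i < committed.size(); i++) {
                    for (String key : committed.get(i).writeSet.keySet()) {
                        if (readSet.contains(key)) return false;  // conflict: caller must restart
                    }
                }
                db.putAll(writeSet);
                committed.add(this);
                return true;
            }
        }
    }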


Wednesday, September 17, 2014

What are some challenges with respect to database concurrency control?

During sequential execution of the transactions, the execution time periods do not overlap, and therefore no transaction concurrency exists. But if we allow the transactions to interleave in a manner that is not properly controlled, we are bound to get undesirable results. There are many situations in which uncontrolled concurrency can harm a database.
In this article we discuss such challenges. Transaction models based upon the ACID rules have proved quite durable over time; they serve as a foundation for present database and transaction systems, and many kinds of environments, parallel as well as distributed, have used this basic model to implement their own complex database systems. These environments do, however, require additional techniques such as the two – phase commit protocol. Though the model has been a great success, it suffers from a few limitations.
The model lacks flexibility and thus cannot represent some kinds of interactions between organizations and complex systems. In collaborative environments, a piece of data cannot always be strictly isolated even when that would be desirable. Also, since the ACID transaction model suits systems with short and simple transactions, it is not so appropriate for workflow management systems.
Such systems require richer transaction models with a multi – level notion of transactions. The model also does not suffice for other environments, e.g. mobile wireless networks, where long disconnection periods are expected. Then there is the internet, a loosely coupled WAN (wide area network), which likewise cannot fully adopt the ACID model because availability is low. We need techniques that help the ACID model adjust to such widely varying situations.
Research is ongoing into new techniques that help maintain concurrency control and support recovery in dissemination – oriented environments, heterogeneous systems and so on. Another problem is that the model cannot exploit data and application semantics through a general mechanism, although such knowledge can improve system performance by a significant margin. Separate research has been going on into recovery and concurrency control. The recovery component of a DBMS is responsible for the durability as well as the atomicity of ACID transactions, and distinguishing between volatile and non – volatile storage becomes absolutely necessary here. The three types of failure mentioned below pose a challenge to the proper working of a DBMS:
- Transaction failure: In some cases it happens that the transaction during execution reaches a state from where it cannot commit successfully. In such cases all the updates made by that transaction have to be erased from the database as a measure of atomicity preservation. This is called transaction rollback.
- System failure: A system failure often causes a loss of volatile memory contents. It has to be ensured that the updates made by all transactions that committed before the crash persist in the database and that the updates made by unsuccessful transactions are removed.
- Media failure: In these failures the non – volatile storage gets corrupted, which makes it impossible to recover the online version of the data. Here the option is to restore the database from an archive and then reapply all the updates using operation logs.
Recovery from the last kind of failure requires additional mechanisms. For all this to take effect, the recovery component has to be reliable; flaws in the recovery system of a database put it at high risk of losing data.


Tuesday, September 16, 2014

What is 2 phase locking with respect to database concurrency control?

2PL or Two – phase locking is the most widely used concurrency control method and is used to enforce serializability. The protocol makes use of two types of locks that transactions apply to data, which blocks other transactions from accessing the same data item while those locks are held.
The 2PL protocol works in the following 2 phases:
- The expanding phase: In this phase locks are only acquired and none are released.
- The shrinking phase: In this phase locks are only released and no new ones are acquired (a bare – bones sketch of the two phases follows).
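A bare – bones sketch of the two phases, assuming a simple in – memory lock table keyed by item name: once the transaction has released any lock, it may not acquire new ones.

    import java.util.HashMap;
    import java.util.Map;
    import java.util.concurrent.locks.ReentrantLock;

    public class TwoPhaseLockingTxn {
        private static final Map<String, ReentrantLock> lockTable = new HashMap<>();
        private final Map<String, ReentrantLock> held = new HashMap<>();
        private boolean shrinking = false;   // set after the first release

        private static synchronized ReentrantLock lockFor(String item) {
            return lockTable.computeIfAbsent(item, k -> new ReentrantLock());
        }

        // Expanding phase: acquiring is legal only while nothing has been released yet.
        public void lock(String item) {
            if (shrinking) throw new IllegalStateException("2PL violated: lock requested after unlock");
            ReentrantLock l = lockFor(item);
            l.lock();                        // blocks while another transaction holds the item
            held.put(item, l);
        }

        // Shrinking phase: from the first release onward, only releases may happen.
        public void unlock(String item) {
            shrinking = true;
            ReentrantLock l = held.remove(item);
            if (l != null) l.unlock();
        }
    }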

Shared locks (S) and exclusive locks (X) are the two types of lock used by this protocol, although many refinements have been produced which use additional lock types. Since 2PL blocks processes using locks, the blocked transactions can end up in deadlocks. The protocol has a number of variants: strict 2PL (S2PL) combines 2PL with strictness, while SS2PL (strong strict two – phase locking), also known as rigorousness, combines 2PL with strong strictness and is the variant most commonly used for concurrency control in database systems. Any schedule that obeys the protocol is serializable.
Typically, when phase 1 of a transaction ends and no explicit information is available, the transaction is said to be in a ready state, i.e., it can commit without requiring any more locks. In such cases phase 2 can end immediately, or may not even be required. In other cases, where more than one process is involved, the end of phase 1 and the start of lock releasing have to be determined with the help of a synchronization point; otherwise the serializability and strict 2PL rules are violated. But determining such a point is very costly, and therefore the transaction end is usually merged with the end of phase 1, eliminating the need for phase 2.
Thus 2PL is turned into SS2PL. In S2PL, transactions release their write (X) locks only after they have ended, either by committing or aborting, while the read (S) locks are released regularly during phase 2. Implementing general S2PL requires explicit support for the end of phase 1.
Strong strict 2PL is also known as rigorous two – phase locking, rigorousness or rigorous scheduling. Under this protocol, both read and write locks are released only after the transaction has completed. A transaction that complies with SS2PL has only phase 1 during its entire lifetime and no phase 2. The class of schedules exhibiting the SS2PL property is called rigorousness, and S2PL is a superset of this class. SS2PL has been the concurrency control mechanism of choice for most database designers; its main advantage is that it provides strictness in addition to serializability.
These two properties are very much necessary for efficient recovery of the database as well as for commitment ordering. Global serializability and distributed serializability solutions are used for distributed environments. The downside of the 2PL protocol is deadlocks: data access operations blocked by locks can end up waiting for each other in a cycle, and none of the blocked transactions can then reach completion. Resolving deadlocks effectively is therefore a major issue; a deadlock can be resolved by aborting one of the locked transactions, thus eliminating the cycle in the precedence graph. Wait – for – graphs are used for detecting deadlocks.


Sunday, September 14, 2014

What is Timestamp ordering method for database concurrency control?

Concurrency control methods with respect to database systems can be divided into two categories, namely:
- Lock – based concurrency control methods: e.g., 2PL, SS2PL.
- Non – lock concurrency control methods: e.g., the timestamp ordering method.
In this article we focus on the latter category, specifically timestamp ordering, which handles transactions safely with the help of timestamps. Let us see how it operates.

The timestamp ordering mechanism makes three assumptions prior to operating:


- The value of every timestamp is unique and the time instant it represents is accurate.
- There are no duplicate timestamps.
- The lower – valued time stamp occurs before a higher – valued timestamp.

There are three methods for generating the timestamps:


- Take the value of the system clock as the timestamp when the transaction starts.
- Use a thread – safe shared counter as the timestamp, incremented when each transaction starts (see the small sketch below).
- Combine the above two methods.
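The second method, a thread – safe shared counter, can be as small as the sketch below (an illustration only, not tied to any particular engine):

    import java.util.concurrent.atomic.AtomicLong;

    public class TimestampGenerator {
        private static final AtomicLong counter = new AtomicLong();

        // Each transaction calls this once when it starts;
        // the values returned are unique and strictly increasing.
        public static long next() {
            return counter.incrementAndGet();
        }
    }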

Formal: A transaction is considered an ordered list of operations. Before its first operation begins, the transaction is marked with the current timestamp. Every transaction is also given an initially empty set of transactions it depends on and an initially empty set of objects it updates. Each object is given two timestamp fields which are used only for concurrency control.

Informal: A timestamp is assigned to the transaction at the very beginning, which makes it possible to tell in which order the transactions have to be applied. So when two transactions operate on the same object, the one with the lower timestamp should be executed first; if an operation would violate this order, the offending transaction is aborted immediately and restarted. The object has one read timestamp and one write timestamp, both of which are updated when the corresponding read and write operations are carried out.

Two cases are considered when an object has to be read by a transaction:
- The transaction started prior to the write timestamp: this indicates that the object's data was changed after the transaction started. The transaction is aborted and then restarted.
- The transaction started after the write timestamp: the transaction can safely read the object, and the read timestamp is updated to the transaction's timestamp.

The following cases are considered when an object has to be written or updated by a transaction (a rough sketch of these read and write rules follows this list):
- The transaction started prior to the read timestamp: this indicates that the object has already been read by something younger. Assuming the reading transaction has a copy of the data, we must not write to the object and change that copy, so the transaction is aborted and restarted.
- The transaction started prior to the write timestamp: this means the object was changed by something after our transaction started. Here we apply the Thomas write rule: the current write operation is simply skipped and execution continues as normal, with no abort or restart required.
- Otherwise, the transaction changes the object and the object's write timestamp is overwritten with the transaction's timestamp.
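The read and write rules above can be sketched roughly as follows, assuming one read timestamp and one write timestamp per object and transaction timestamps taken from a counter; the object model is an assumption made for the example.

    public class TimestampOrderedObject {
        private long readTs = 0;    // timestamp of the youngest reader so far
        private long writeTs = 0;   // timestamp of the youngest writer so far
        private Object value;

        // Returns the value, or throws to signal "abort and restart the transaction".
        public synchronized Object read(long txnTs) {
            if (txnTs < writeTs) {
                throw new IllegalStateException("abort: object was written after this transaction started");
            }
            readTs = Math.max(readTs, txnTs);
            return value;
        }

        public synchronized void write(long txnTs, Object newValue) {
            if (txnTs < readTs) {
                throw new IllegalStateException("abort: a younger transaction already read this object");
            }
            if (txnTs < writeTs) {
                return;   // Thomas write rule: a younger write is already in place, skip ours
            }
            writeTs = txnTs;
            value = newValue;
        }
    }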

Recoverability: This concurrency control method does not, by itself, produce recoverable histories. To make recovery possible, the scheduler has to keep, for each transaction, a list of the transactions it has read from; a transaction should not be allowed to commit until every transaction in that list has committed. In addition, data produced by uncommitted transactions can be tagged as dirty, and read operations banned from using such data, as a measure against cascading aborts. To maintain a strict history, the scheduler should not permit transactions to carry out any operations on dirty data.


Friday, September 12, 2014

What are some major goals of database concurrency control?

Serial or sequential execution of the transactions that access the database involves no overlapping of time periods, and therefore no concurrency can be achieved in database systems following such a simple execution mechanism. But if concurrency is introduced by allowing the operations of transactions to interleave without proper control, we risk getting undesirable results. Below we give some examples:
- The dirty read problem: One transaction, say A, reads a data item that has been written by another transaction, say B, which later aborts. The value read is the result of an aborted operation, is not valid, and should not have been read by other transactions; this is called a dirty read. As a result, transaction A will produce incorrect results.
- The lost update problem: A transaction, say B, writes a data item on top of a value written earlier by another concurrent transaction, say A, while A is still in progress. The value written by A is then lost to other transactions which, by the rules of precedence, should have read the first value before the second; as a consequence those transactions yield wrong results (the small two – thread sketch after this list reproduces this effect).
- The incorrect summary problem: One transaction, say A, computes a summary over the values of multiple instances of a data item while another transaction, say B, is changing the value of some of those instances. The end result therefore does not reflect a correct summary, which is necessary for correctness; certain updates may or may not be included in the summary depending on the instants at which they were made.
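The lost update problem is easy to reproduce even outside a database. In the small sketch below, two threads each add 1 to a shared balance 100,000 times with no concurrency control; because their read – modify – write cycles interleave, some updates are usually lost and the final total falls short of 200,000.

    public class LostUpdateDemo {
        private static int balance = 0;   // shared, unprotected data item

        public static void main(String[] args) throws InterruptedException {
            Runnable deposit = () -> {
                for (int i = 0; i < 100_000; i++) {
                    int current = balance;   // read
                    balance = current + 1;   // write based on a possibly stale read
                }
            };
            Thread a = new Thread(deposit);
            Thread b = new Thread(deposit);
            a.start();
            b.start();
            a.join();
            b.join();
            // Expected 200000, but interleaved read-modify-write cycles usually lose updates.
            System.out.println("balance = " + balance);
        }
    }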

Database systems that require high transactional performance need their concurrent transactions to execute correctly so as to meet certain goals; in fact, for modern businesses, a database that does not meet such goals cannot even be considered. What are these goals? Let us see below:
- Correctness: To attain this goal, the system must allow the execution of only serializable schedules; without serializability we may face the problems listed above. Serializability can be defined as the equivalence of a schedule to some serial schedule over the same transactions, i.e., one in which the transactions run sequentially, without time overlaps, and in isolation. The highest isolation level is obtained only through serializability. In some cases serializability is deliberately relaxed to give the system better performance, or relaxed in distributed environments to satisfy availability requirements, but only on the condition that correctness is not compromised. A practical example is transactions involving money: if we relax serializability here, money can be transferred to the wrong account. Concurrency control mechanisms achieve serializability by means of conflict serializability, which is a special case of serializability.
- Recoverability: This is another major goal, which insists that after a crash the database must be able to recover efficiently without losing the effects of successfully committed transactions. Recoverability ensures that the database system can tolerate faults and does not become corrupted by media failure. The recovery manager is responsible for protecting the data the system stores in its database; it works together with the concurrency control component to provide reliable, always – available access to data. Together these two components keep the database system in a consistent state.

The database systems must ensure that the integrity of a transaction is not affected by any kind of fault. In the modern world, with the high frequency of transactions and large monetary values at stake, any violation of the above goals can be very expensive.


Thursday, September 11, 2014

Database access: What is Multi - version concurrency control?

A common concurrency control method used by database management systems is multiversion concurrency control, often abbreviated MVCC or MCC. The method helps regulate the problem – causing situation of concurrent access to the database, and it is also used for implementing transactional memory in programming languages. When a read operation and a write operation are carried out on the same data item at the same time, the database is likely to appear inconsistent to the user; using such half – written data further is dangerous and can lead to failures.
Concurrency control methods provide the simplest solution to this problem: they make sure that no read operation is carried out until the write operation is complete. In this case the data item is said to be locked by the write operation. But this traditional method is quite slow, since processes needing read access have to wait. So MVCC takes a different approach: at any given instant, each connected user is shown a snapshot of the database.
Changes made by a write operation are not visible to other database users until the write has completed. In this approach, when an item has to be updated the old data is not overwritten with the new data; instead the old data is marked as obsolete and the new data is added to the database. The database thus holds multiple versions of the same data, of which only one is the latest. With this method users can keep accessing the data they were reading when they began their operation, regardless of whether it has since been modified or deleted by another thread. Although this method avoids the overhead of filling in memory holes, it does require the system to periodically inspect the database and remove obsolete data.
In document – oriented databases, this method allows documents to be stored in a contiguous section of the disk; when a document is updated it is simply rewritten, so maintaining multiple linked pieces of the document is not required. MVCC provides point – in – time consistent views. Read operations carried out under MVCC use either a transaction ID or a timestamp to determine which state of the database to read; this avoids the need for lock management, because read operations are isolated from the write operations.
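A toy sketch of the versioning idea described above: every write appends a new version stamped with the ID of the writing transaction, and a reader sees the newest version committed at or before its own snapshot ID. The data structures are assumptions made for the example.

    import java.util.ArrayList;
    import java.util.List;

    public class MultiVersionItem {
        private static class Version {
            final long commitId;   // ID of the transaction that wrote this version
            final String value;
            Version(long commitId, String value) { this.commitId = commitId; this.value = value; }
        }

        // Versions are appended in increasing commitId order.
        private final List<Version> versions = new ArrayList<>();

        // Writers never overwrite: they append a new version.
        public synchronized void write(long commitId, String value) {
            versions.add(new Version(commitId, value));
        }

        // A reader with snapshot ID s sees the newest version with commitId <= s,
        // so concurrent writers never disturb what it reads.
        public synchronized String readAsOf(long snapshotId) {
            String result = null;
            for (Version v : versions) {
                if (v.commitId <= snapshotId) result = v.value;
            }
            return result;
        }
    }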
A write operation affects only a later version of the data, while the reading transaction continues to see a consistent snapshot tied to its own transaction ID, because the write is performed under a later transaction ID. MVCC achieves transactional consistency by means of increasing IDs or timestamps. Another advantage of this concurrency control method is that no transaction has to wait for an object, since many versions of the same data object are maintained, each labeled with a timestamp or an ID. But MVCC has its limits too: the snapshot isolation it typically provides is not fully serializable, and in some cases write skew anomalies and read – only transaction anomalies surface. There are two solutions to these anomalies, namely:

- Serializable snapshot isolation and
- Precisely serializable snapshot isolation

But these solutions work at the cost of aborting transactions. Some databases that use MVCC are:
- ArangoDB
- Bigdata
- CouchDB
- Cloudant
- HBase
- Altibase
- IBM DB2
- Ingres
- Netezza


Sunday, September 7, 2014

Some tools that can be used in Test Driven development

Here we present a list of tools that can be used for carrying out Test Driven Development (TDD). Test driven development serves as an alternative to big upfront design. Both efficiency and development speed improve with this process, and finding mistakes is fast and cheap. The iterations are short and therefore provide frequent feedback. The test suites stay up to date and executable, and serve as complete documentation in every sense. The tests are written before the code, and the code is then refactored to keep its quality high.

Tools for TDD are:
• JUnit: This is Java's unit – testing framework.
• Some commonly used refactoring tools are RefactorIT, IntelliJ IDEA, Eclipse, Emacs and so on.
• HTTPUnit: This is a black box web testing tool and is used for automated web site testing.
• Latka: A java implemented functional testing tool. This tool can be used with JUnit or Tomcat.
• Cactus: A server – side unit testing tool that provides a simple framework for unit testing server – side code; it extends JUnit. The tool provides three types of unit tests, i.e., code logic unit testing, functional unit testing and integration unit testing.
• Abbot and Jemmy: Tools for GUI testing. The first provides scripted control of actions based on high – level semantics; Jemmy is more powerful and provides full support for Swing.
• Xmlunit: This tool lets you compare two XML files.
• Ant: The tool is based upon Java and XML and is completely platform independent.
• Anteater: An Ant based testing framework. It is used for writing tests that check the functionality of web services and web applications.
• Subversion: This one is a replacement for CVS and controls directories in a better way.
• Jmock: The tool uses mock objects, which are dummy implementations of the real functionality used for checking the behaviour of the code, since non – trivial code often cannot be tested in isolation. Mock objects can help with everything from simplifying test structure to cleaning up domain code. You should start by testing one feature at a time and rooting out problems one by one; testing normally, without any mock objects, can be hard. Before carrying out a test, decide what needs to be verified and show that it passes, and only after this add mock objects to represent these concepts.


Other unit – testing frameworks that can be used in the same way include:
• csUnit
• CUnit
• Googletest
• DUnit
• HTMLUnit
• NUnit
• OUnit
• PHPUnit
• pyUnit
The above mentioned tools are used for writing unit tests in Test Driven Development. However, these unit tests are themselves bound to have errors, and they are not apt for finding errors that result from interactions between different units. The path to success involves keeping things as simple as possible.
The end result of TDD is highly rewarding: the program design is organic and consists of loosely coupled components. Everything you think can go wrong should be tested, proceeding in baby steps. One thing to take care of is that the unit tests should always run at 100%. Refactoring is not about changing the behaviour of the code; it is about improving the internal code structure, and it should be carried out to improve the quality, maintainability and reliability of the code.
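As a tiny illustration of this cycle with JUnit, the test below is written first and fails (red) because the Calculator class does not yet exist; then just enough code is written to make it pass (green), after which it can be refactored. Calculator is of course a made – up example class.

    import org.junit.Test;
    import static org.junit.Assert.assertEquals;

    public class CalculatorTest {

        // Step 1 (red): this test is written before Calculator exists, so at first it cannot even compile.
        @Test
        public void addsTwoNumbers() {
            Calculator calc = new Calculator();
            assertEquals(5, calc.add(2, 3));
        }
    }

    // Step 2 (green): the simplest code that makes the test pass.
    class Calculator {
        int add(int a, int b) {
            return a + b;
        }
    }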


What is the difference between Test Driven development and Acceptance Test Driven development?

Test driven development (TDD) and Acceptance Test Driven Development (ATDD) are related to each other. TDD is a developer's tool for creating properly written code units, i.e., modules, classes and functions, that carry out the desired behaviour correctly. ATDD, on the other hand, acts as a tool for communication between the developers, testers and customers to define requirements. Another difference is that TDD requires test automation while ATDD does not, although ATDD might require automated regression testing. The tests written for ATDD can be used to derive tests for TDD, because a part of the requirement is implemented in the code. It is important that ATDD tests be readable by the customers, which is not required of TDD tests.
TDD involves writing test cases and watching them fail before the code is written. Next, just enough code is written to make the tests pass, and the code is then refactored as required; the tests must still pass afterwards. The primary focus of TDD is on methods and single classes, i.e., low – level functionality, which leads to much more flexible code. Now what is the need for ATDD? The reason is that TDD just tells you that your code is fine; it doesn't tell you why that piece of code is even required. In ATDD, however, the acceptance criteria are stated in the early stages of the software development process itself, and in the succeeding stages these criteria are used to guide development. ATDD is more of a collaborative activity engaging everyone from developers and testers to business analysts and product owners, and it ensures the implementation is understood by everyone involved in the development process.
There is a third approach called BDD, or behaviour driven development. It is quite similar to TDD except that the tests are called specs, and the focus of BDD is on the system's behaviour rather than its implementation details. It also focuses on the interactions taking place in software development, with developers using two languages: the domain language and their native language. TDD consists of unit tests while ATDD uses acceptance tests, and the focus of the latter is at a higher level. BDD helps make the product more customer – focused and makes collaboration between developers and other stakeholders easier. Using TDD techniques and tools requires technical skill, because you have to know the details of the objects involved; non – technical stakeholders would be lost following the unit tests of TDD. BDD provides a clearer view of the purpose of the system, whereas TDD provides this view mainly to the developers.
The result of the ATDD methodology is entirely based upon the communication that takes place between the developers, testers and their customers. A number of practices are involved in ATDD, such as behaviour driven testing, specification by example, story test – driven development, example – driven development and so on. With these practices the developers are able to understand the needs of the customers before the actual implementation. The difference between TDD and ATDD arises from the latter's emphasis on the collaboration between the three parties, i.e., developers, testers and customers. Acceptance testing is encompassed in ATDD, but in one way it is similar to TDD: it insists on writing the tests before coding is started. These tests provide an external view of the system from the point of view of a user.


Monday, September 1, 2014

What is Commitment ordering method for database concurrency control?

Commitment ordering or CO is a class of techniques for achieving interoperable serializability in the concurrency control mechanisms of transaction processing systems, database systems and other database – related applications. With the use of commitment ordering methods we can have non – blocking or optimistic implementations. With the advent of multi – processor CPUs, there has been a tremendous increase in the use of CO in transactional memory (software transactional memory in particular) and concurrent programming; in these fields CO is used as a means of achieving non – blocking serializability.
In a schedule that is CO compliant, the chronological order of the commitment events is compatible with the precedence order of the respective transactions. Viewed broadly, CO is a special case of conflict serializability. It is effective, high – performance, reliable, distributable and scalable, and with these qualities it is a good way of achieving modular serializability across a heterogeneous collection of database systems, i.e., one containing database systems that employ different concurrency control methods. A database system that is not CO compliant is linked to a CO component such as a COCO (commitment order coordinator), whose purpose is to put the commitment events in order so as to make the system CO compliant, without accessing the data itself or interfering with the operation of the transactions. All this keeps the overhead low and gives an appropriate solution for distributed and global serializability. A fundamental part of this solution is the atomic commitment protocol (ACP), which is used to break the cycles present in the conflict graph (either a serializability graph or a precedence graph). If the concurrency control information is not shared among the involved database systems beyond the ACP messages, or if they have no knowledge about each other's transactions, then CO becomes a necessary condition for achieving global serializability.
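As a very simplified sketch of the ordering idea (not the actual COCO protocol), a coordinator can simply refuse to let a transaction commit until every transaction that precedes it in the local precedence order has finished, so the chronological commit order can never contradict the precedence order. Everything below, from the class name to the method signatures, is an assumption made for illustration.

    import java.util.*;

    public class CommitOrderCoordinator {
        private final Map<Long, Set<Long>> predecessors = new HashMap<>(); // txn -> txns that must finish first
        private final Set<Long> finished = new HashSet<>();                // committed or aborted transactions

        // Record that 'earlier' precedes 'later' in the conflict (precedence) order.
        public synchronized void addPrecedence(long earlier, long later) {
            predecessors.computeIfAbsent(later, k -> new HashSet<>()).add(earlier);
        }

        // Block the commit of 'txn' until all of its predecessors have finished,
        // so the chronological commit order matches the precedence order.
        public synchronized void awaitCommit(long txn) throws InterruptedException {
            while (!finished.containsAll(predecessors.getOrDefault(txn, Collections.emptySet()))) {
                wait();
            }
            finished.add(txn);
            notifyAll();
        }

        public synchronized void abort(long txn) {
            finished.add(txn);   // an aborted predecessor no longer blocks anyone
            notifyAll();
        }
    }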
Another advantage of CO is that it does not require costly distribution of local concurrency control information such as timestamps, tickets, locks and local precedence relations. It makes use of the SS2PL property, and SS2PL used with 2PC (the two – phase commit protocol) is the de facto standard through which global serializability is achieved; CO also provides a transparent way for other CO compliant systems to join such global solutions. When a multi – database environment is based upon commitment ordering, global deadlocks can be resolved automatically without human intervention, which is an important benefit of CO compliant systems. The intersection of CO and strictness is called strict commitment ordering (SCO); it results in better overall throughput, shorter transaction execution times and thus better performance than traditional SS2PL, with the positive impact of SCO felt mainly under lock contention. By virtue of the strictness property, the same database recovery mechanism can be used by both SCO and SS2PL. Today we have two major variants of CO, namely:
- CO – MVCO and
- CO – ECO
The first is the multi – version variant and the second the extended variant. Any relevant concurrency control method can be combined with these two to obtain non – blocking implementations; both make use of additional information to relax the constraints and improve performance. CO and its variants employ a technique called Vote Ordering (VO), a containing schedule – set property. When no concurrency control information is shared, global serializability can be guaranteed only if local VO holds. The interoperation of CO and its variants is quite transparent, which makes automatic deadlock resolution possible in heterogeneous environments as well.


Wednesday, August 27, 2014

What is Strong strict Two-Phase locking?

Strong strict two – phase locking is a life – saver concept for a database system. We might also call it rigorous scheduling, rigorous two – phase locking or rigorousness; in short it is written as SS2PL. To comply with this protocol, a transaction releases both its read (S) locks and its write (X) locks only after it has completed its execution or has aborted midway. The protocol also complies with the S2PL rules. A transaction that obeys this protocol is said to be in phase 1 and continues to be in that phase until it completes its execution; there is no real phase 2 in such transactions. Thus we effectively have only one phase, but we still say 'two – phase' because the concept derives from 2PL, which is its superclass.
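A minimal sketch of the SS2PL discipline, assuming an in – memory lock table keyed by item name: shared and exclusive locks are taken as data is touched, and nothing is released until the transaction commits or aborts.

    import java.util.*;
    import java.util.concurrent.locks.Lock;
    import java.util.concurrent.locks.ReadWriteLock;
    import java.util.concurrent.locks.ReentrantReadWriteLock;

    public class SS2PLTransaction {
        private static final Map<String, ReadWriteLock> lockTable = new HashMap<>();
        private final List<Lock> held = new ArrayList<>();

        private static synchronized ReadWriteLock lockFor(String item) {
            return lockTable.computeIfAbsent(item, k -> new ReentrantReadWriteLock());
        }

        public void readLock(String item) {    // shared (S) lock, taken when the item is read
            Lock l = lockFor(item).readLock();
            l.lock();
            held.add(l);
        }

        public void writeLock(String item) {   // exclusive (X) lock, taken when the item is written
            Lock l = lockFor(item).writeLock();
            l.lock();
            held.add(l);
        }

        // SS2PL: every lock is held until the transaction ends, whether it commits or aborts.
        public void commitOrAbort() {
            for (Lock l : held) l.unlock();
            held.clear();
        }
    }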
A schedule's SS2PL property is also called rigorousness, and the same name is used for the class of schedules exhibiting this property, so an SS2PL schedule is often characterized as a rigorous schedule. People often prefer this term since it avoids the (unnecessary) legacy of 'two phase' and is also independent of any particular locking protocol. The mechanism used to enforce this property is known as rigorous 2PL. SS2PL is a special case of S2PL, i.e., a proper subclass of it. Most database systems use SS2PL as their concurrency control protocol; it has been in wide use since the early days of databases in the 1970s. It is a popular choice with many database developers because, apart from providing serializability, it also imposes strictness, which is a type of cascadeless recoverability.
Strictness is very important for efficient recovery of the database in the event of failure. For a database system to participate effectively in a distributed environment, commitment ordering (CO) is needed, and SS2PL provides it; global serializability and distributed serializability solutions based upon CO can then be implemented. A distributed SS2PL implementation that does not depend on a distributed lock manager (DLM) exists as a special case of the commitment ordering method, and distributed deadlocks are not a problem since they are resolved automatically.
Global serializability can be ensured by employing SS2PL across the databases of a multi – database system. Though this fact was known long before the arrival of the CO concept, it is with CO that we understand the atomic commitment protocol's role in maintaining this serializability and in resolving the related deadlocks. The fact that SS2PL inherits the properties of CO and recoverability is more significant than the fact that it is a subset of 2PL; 2PL by itself is just a primitive serializability mechanism and cannot provide SS2PL's other qualities. S2PL, i.e., strictness combined with 2PL, is not of much practical use either; contrary to S2PL, SS2PL also provides the properties of commitment ordering.
Today there are a number of variants of SS2PL, each with different semantics and used under different conditions; multiple granularity locking is one popular variant. Any two of these schedule classes are either incomparable or one contains the other, and in either case they have schedules in common. Locks are the main culprits in blocking transactions against each other, and this mutual blocking can lead to deadlocks, a condition where the execution of the involved transactions goes nowhere. The deadlocks need to be resolved in order to release the trapped resources; a deadlock materializes as a cycle in the wait – for graph, reflecting a potential cycle in the precedence graph.

