# 2-SAT problem

Get this book -> Problems on Array: For Interviews and Competitive Programming

**Boolean satisfiability** is a **NP-complete problem** but, a special case of it can be solved in **polynomial time**. This special case is called cas **2-SAT or 2-Satisfiability**.

We have explained the basic ideas to understand this problem with depth along with solution.

# Boolean Satisfiability

## Boolean formulas

Boolean formulas are ones where the variables can take values 0 (false) or 1 (true). Some operators in Boolean formulas:

Name | Operator |
---|---|

not | $ \neg a $ |

and | $ a \wedge b $ |

or | $ a \vee b $ |

implies | $ a \implies b $ |

equivalent | $ a \iff b $ |

Note: $ and $ is also called as conjunction and $ or $ is also called as disjunction.

The resultant truth value of the operators depends on the truth values of the variables. Example: `not`

of a variable is true if the value of the variable is false.

## Important properties

Both conjunction and disjunction operators are commutative and associative.

### Property 1:

$ (a \implies b) \iff (\neg a \vee b) $

The LHS and RHS are equivalent and can be replaced with one another in a larger formula.

### Property 2:

If, $ (a \implies b) \wedge (b \implies a) $ then,

$ (a \iff b) $.

If $ a $ implies $ b $ and $ b $ implies $ a $ are both true then $ a $ and $ b $ are equivalent.

### Property 3:

If, $ (a \implies b) \wedge (b \implies c) $ then,

$ (a \implies c) $.

## Conjunctive Normal Form (CNF)

Any boolean formula can be represented in a standard form called as conjunctive normal form.

- Literal : it is a variable or it's negation
- Clause : it is a disjunction (or) of literals.
- CNF : it is a conjunction (and) of clauses.

Example:

$$ (a \vee \neg b) \wedge (\neg a \vee b) \wedge (a \vee b \vee c) \wedge (a \vee \neg c \vee d) $$

## Satisfiability

Boolean satisfiability is the problem of finding suitable assignments for all the variables present in a boolean formula such that the formula becomes true.

Examples:

$$ (a \vee b) \wedge (\neg a \vee \neg b) $$

This formula can be made true by assigning $ a $ to true and $ b $ to false (not the only assignment). Thus, the above formula is satisfiable.

$$ (a \wedge \neg a) $$

This formula cannot be made true because $ a $ has to be true and false at the same time.

## 2-Satisfiability

Boolean satisfiability is a NP-complete problem but, a special case of it can be solved in polynomial time. This special case is called cas 2-SAT or 2-Satisfiability.

In 2-SAT, the formula when represented in CNF, has exactly 2 variables in every clause. Such a CNF is called 2-CNF.

Note: NP-complete problems are those which does not have a polynomial time solution. Also, it has not been proved that such a solution does not exist. Every NP-complete problem is reducible to some other NP-complete problem in polynomial time - therefore, finding a polynomial time solution to any NP-complete problem is same as finding a solution to all NP-complete problems.

# Algorithm

## Implication graph

The algorithm to find if a 2-CNF is satisfiable or not first constructs a graph called implication graph.

Implication graph is a directed graph where the nodes are literals present in the formula and a directed edge is present from literal $ u $ to literal $ v $ if $ u $ implies $ v $ can be obtained from some clause in the 2-CNF formula.

Example:

$$ (a \vee \neg b) \wedge (\neg a \vee b) \wedge (\neg a \vee \neg b) \wedge (a \vee \neg c) $$

The implications that can be derived from the clauses in the above formula (using Property 1):

- $ (\neg a \implies \neg b) \text{ and } (b \implies a) $
- $ (a \implies b) \text{ and } (\neg b \implies \neg a) $
- $ (a \implies \neg b) \text{ and } (b \implies \neg a) $
- $ (\neg a \implies \neg c) \text{ and } (c \implies a) $

There are 2 implications per clause as disjunction is commutative. Using the above implications, the graph is constructed as follows:

Note: Repeating implication (edges) should be ignored.

If there is a path from some literal $ u $ to another literal $ v $ in the implication graph, then $ u \implies v $ is true (by repeated application of Property 3 on the literals along the path).

## Satisfiability criteria

The 2-CNF formula represented using the implication graph is satisfiable if and only if no two complementary literals are present in the same strongly connected component (SCC) of the implication graph.

If there exists some $ a $ such that $ a $ and $ \neg a $ are present in the same strongly connected component, then there exists a directed path from $ a $ to $ \neg a $ and from $ \neg a $ to $ a $. As mentioned above, these paths in implication graph means that:

$$ (a \implies \neg a) \text{ and } (\neg a \implies a) $$ are both true at the same time. Then, by Property 2 we have:

$$ (a \iff \neg a) $$

Which is a contradiction, therefore, such a situation should not be present for the formula to be satisfiable.

The strongly connected components in the above image are highlighted below:

It can be seen that no two complementary literals are present in the same SCC, making the above 2-CNF satisfiable.

## Implementation

Note: It is assumed that the reader knows how to find the strongly connected components in a directed graph.

There are many ways to implement this algorithm.

### Kosaraju implementation details

The following C++ implementation uses 3 auxiliary functions `dfs`

, `rdfs`

and `kosaraju`

.

The algorithm requires two versions of the implication graph `g`

and `rg`

. Where `g`

is the actual implication graph and `rg`

is the implication graph with the direction of the edges reversed.

The `dfs`

function is used to find the topological order of graph `g`

, which is stored in the vector `topo`

. The reverse graph `rg`

is traversed according to the order in `topo`

using `rdfs`

, where the id (the function parameter `cur`

) of the SCC is stored for each vertex `u`

in `component[u]`

.

```
const int N = MAX_VARIABLES;
int n;
vector<int> g[2 * N + 1];
vector<int> rg[2 * N + 1];
vector<int> topo;
int component[2 * N + 1];
bool visit[2 * N + 1];
void dfs(int u) {
visit[u] = true;
for(auto v: g[u]) {
if(!visit[v]) {
dfs(v);
}
}
topo.emplace_back(u);
}
void rdfs(int u, int cur) {
visit[u] = true;
component[u] = cur;
for(auto v: rg[u]) {
if(!visit[v]) {
rdfs(v, cur);
}
}
}
void kosaraju() {
for(int i = 1; i <= n; ++i) {
if(!visit[i]) {
dfs(i);
}
}
reverse(topo.begin(), topo.end());
memset(visit, false, sizeof visit);
int cur = 0;
for(auto u: topo) {
if(!visit[u]) {
++cur;
rdfs(u, cur);
}
}
}
```

As an alternative, Tarjan's algorithm can also be used to achieve the same.

## Implication graph construction and 2-SAT

Implication graph is constructed from the 2-CNF input. The input is in the form of a vector<vector<int>>, where each internal vector is of size 2 (2 literals). The literals are denoted by integers - positive if the literal is positive and negative if the literal is negative. The CNF formula:

$$ (a \vee \neg b) \wedge (\neg a \vee b) \wedge (\neg a \vee \neg b) \wedge (a \vee \neg c) $$

Can be expressed as:

```
{
{1, -2},
{-1, 2},
{-1, -2},
{1, -3}
}
```

Where 1, 2, and 3 are $ a $, $ b $ and $ c $ respectively.

The following C++ implementation is used to construct an implication graph and find if the formula is 2-SAT or not.

The function `getVertexId`

is used to find the vertex index for a given literal, every odd index is the positive literal and the very next even index is the corresponding negative literal.

Example: literal `1`

in the above input is vertex `1`

and literal `-1`

is vertex `2`

.

The function `getComplementVertex`

finds the vertex index of a literal's complement given the vertex index of the literal.

```
int getVertexId(int literal) {
// every odd index is the positive literal
// and the very next even index is the corresponding negative literal
if(literal < 0) return 2 * abs(literal);
else return 2 * literal - 1;
}
int getComplementVertex(int vertex) {
if(vertex&1) return vertex + 1;
else return vertex - 1;
}
bool is2Sat(int var_count, vector<vector<int>> cnf) {
// construction of implication graph (g)
n = 2 * var_count;
for(int i = 0; i < cnf.size(); ++i) {
int u = getVertexId(cnf[i][0]);
int v = getVertexId(cnf[i][1]);
g[getComplementVertex(u)].emplace_back(v);
g[getComplementVertex(v)].emplace_back(u);
// reverse graph for kosaraju's algorithm
rg[v].emplace_back(getComplementVertex(u));
rg[u].emplace_back(getComplementVertex(v));
}
kosaraju();
// satisfiability criteria check for all literals
for(int i = 1; i <= n; i += 2) {
if(component[i] == component[i + 1]) {
// violation
return false;
}
}
return true;
}
```

The result for the above input will be true because no two complementary literals are in the same SCC as seen in the image.

## Finding a satisfiable assignment

Assuming that the satisfiablity constraint holds in the given formula, some assignment to the variables can be found using the `component`

array.

`component[u]`

is the identity of the SCC in which vertex `u`

is present.

Note: If the directed graph is not disconnected, then identities of the SCCs are unique and always in topological order when they are assigned i.e) if there are 2 SCCs in a graph whose ids are $ x $ and $ y $; if $ x $ < $ y $, then there is a path from SCC $ x $ to SCC $ y $ but the converse is not possible (because $ x $ and $ y $ will be identities of the same SCC).

If the satisfiability constraint holds, then for every literal $ a $, only one of the following can be true:

- component[$ a $] < component[$ \neg a $] ($ \neg a $ is reachable from $ a $): In this case $ (a \implies \neg a) $ is true, therefore, $ a $ must be false.
- component[$ \neg a $] < component[$ a $] ( $ a $ is reachable from $ \neg a $): In this case $ (\neg a \implies a) $ is true, therefore, $ a $ must be true.

```
vector<bool> getAssignment(int var_count) {
int lit_count = 2 * var_count;
vector<bool> assignment(var_count + 1, false);
for(int i = 1; i <= lit_count; i += 2) {
assignment[i / 2 + 1] = component[i + 1] < component[i];
}
return assignment;
}
```

For the above example, truth values of $ a $, $ b $ and $ c $ will be false because their complements $ \neg a $, $ \neg b $ and $ \neg c $ respectively, are reachable from them.

## Complexity

Time complexity for finding if a 2-CNF is satisfiable or not is `O(m + n)`

, where `m`

is the number of variables and `n`

is the number of clauses (for finding SCCs).

Time complexity for implication graph construction is `O(n)`

.

Time complexity for finding assignment is `O(m)`

.

Memory usage is `O(n + m)`

- for the reverse graph in Kosaraju's algorithm and other auxiliary arrays like `component`

and `visit`

.

# Applications

## Scheduling

2-SAT can be used to verify if a given classroom schedule consisting of n teachers and m classrooms has some conflict (Eg: 2 teachers assigned to the same classroom at the same time).

## Renamable Horn satisfiability

Another special sub-class of boolean satisfiability is renamable horn satisfiability, where there can be any number of literals per clause but at most one positive literal. This can also be solved in polynomial time with the aid of 2-SAT algorithm.

## Sat Solvers

Sat solvers are tools that solve the general boolean satisfiability problem, they are based on other complex algorithms (Eg: conflict driven clause learning) and heuristics. 2-SAT algorithm is one of the optimizations that can be found in many sat solvers.

## Question

#### If the second clause in the example is replaced with $ a \vee b $, what will be the truth values of {a, b, c}? (1 - true, 0 - false, UNSAT - if the formula is not satisfiable)

## Question

#### If the final clause in the example is replaced with $ a \vee b $, what will be the truth values of {a, b, c}? (1 - true, 0 - false, UNSAT - if the formula is not satisfiable)

With this article at OpenGenus, you must have the complete idea of 2-SAT problem. Enjoy.