In this article, we explore how to design and implement a dictionary console application in the C programming language. You should follow this guide and develop your own version; it makes a strong addition to a software development engineer (SDE) portfolio.
INTRODUCTION
Most software for embedded systems is written in C or C++ because C/C++ source code compiles directly to assembly and machine code. C++ is often described as a superset of C: it is largely the C language extended with object orientation (classes, polymorphism, inheritance, access modifiers, etc.), although a few valid C constructs are not valid C++.
The greatest advantage of coding in C/C++ is that it gives the programmer direct access to low-level computer resources through tools such as pointers, memory allocation and deallocation functions, and the asm keyword, which enables inline assembly (a compiler extension in C). There is also the register keyword, which suggests that the compiler keep a variable in a processor register instead of RAM for better performance.
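To make the idea of direct memory access concrete, here is a minimal sketch that allocates an array with malloc, fills it through a moving pointer and releases it with free. The function name sum_of_squares is our own choice for illustration and does not appear in the dictionary application.

```c
#include <stddef.h>
#include <stdlib.h>

/* Returns 1*1 + 2*2 + ... + n*n, or -1 if the allocation fails.
   The array exists only to show manual allocation and pointer access. */
long sum_of_squares(size_t n) {
    /* malloc reserves room for n ints on the heap; the program owns it */
    int *values = malloc(n * sizeof *values);
    if (values == NULL && n > 0) {
        return -1;
    }
    /* write through a moving pointer instead of an index */
    int *p = values;
    for (size_t i = 0; i < n; ++i) {
        *p = (int)(i + 1);
        ++p;
    }
    long total = 0;
    for (size_t i = 0; i < n; ++i) {
        total += (long)values[i] * values[i];
    }
    /* every successful malloc needs exactly one matching free */
    free(values);
    return total;
}
```

Forgetting the call to free would leak the array, and using values after free would be undefined behaviour; this is the kind of precision the language asks for.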
Coding in C
Coding in C is not easy. Writing and debugging a C program usually takes more effort than doing the same in a higher-level language such as Java or Python. The low-level tools it provides must be handled with precision, because the slightest mistake can lead to errors. In addition, the syntax is not always easy to understand.
Other programming languages tend to reduce ambiguity in their syntax through extensive use of English keywords, predefined complex data types that hide pointers, and the absence of functions for direct memory allocation. C is usually chosen for programs that need tight control over system resources.
What to figure out
A dictionary is a piece of software which interacts with a user to provide the definition(s) of a word (phonetics, part of speech, meaning, examples, etc.). We have a file where we store words and their corresponding meanings, so that each time someone asks for a particular word, we search the file and return the meaning of the word if it is there.
Designing the dictionary
C supports modular programming and follows the procedural approach. Modular programming means that a program can be divided into modules. The procedural approach means that a program is built by defining functions and calling them from one another.
Structure
We can represent the dictionary application as shown on the diagram below.
- The application consists of two modules.
- Each module contains a header(.h) and a C file (.c).
- Headers are used for declarations, hence serve as abstractions similar to interfaces in Object Oriented programming.
Get your copy of the application by running the command:
git clone https://github.com/OpenGenus/dictionary-in-c.git
directory-operations.h
#include <unistd.h>
#include <sys/types.h>
#include <pwd.h>
#include <stdlib.h>
char* get_homedir();
file-operations.h
//declarations for functions related to data_base
#include <stdio.h>
//maximum column number for a line in data_base
#define MAX_LINE_COL 100
FILE* prepare_file(char* file_name, char* mode);
char* to_upper_string(char* target_string);
char* read_file_line(FILE* f );
int search_word(char* target , FILE *f);
int is_new_word(char* temp, char* searched_word);
void print_results( FILE *f , char* searched_word );
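A common companion to headers like the two above is an include guard, which keeps a header from being processed twice when several files include it. The sketch below is a hypothetical example: word-utils.h, WORD_UTILS_H and count_letters are made-up names, not part of the repository, and the header and its C file are shown in one block for brevity.

```c
/* word-utils.h (hypothetical header, not part of the repository) */
#ifndef WORD_UTILS_H
#define WORD_UTILS_H

#include <stddef.h>

/* declaration only: tells callers the name, parameters and return type */
size_t count_letters(const char *word);

#endif /* WORD_UTILS_H */

/* word-utils.c (hypothetical): would start with #include "word-utils.h" */
#include <ctype.h>

/* definition: the actual body lives in the .c file */
size_t count_letters(const char *word) {
    size_t n = 0;
    for (; *word != '\0'; ++word) {
        /* the cast avoids undefined behaviour for negative char values */
        if (isalpha((unsigned char)*word)) {
            ++n;
        }
    }
    return n;
}
```

A file that includes the header sees only the prototype, so the body of count_letters can change without touching its callers.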
- C files contain implementations of the functions declared in header files, serving as implementations similar to classes in Object Oriented programming.
directory-operations.c
//implementation of directory-operations.h
#include "directory-operations.h"
/**
* @brief get value of home directory
* @return a pointer containing the value of the current home directory
*/
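char* get_homedir(void){
char *homedir = getenv("HOME");
if (homedir == NULL) {
//fall back to the password database entry of the current user
struct passwd *pw = getpwuid(getuid());
//getpwuid returns NULL when no entry is found
if (pw != NULL) homedir = pw->pw_dir;
}
return homedir;
}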
file-operations.c
//implementation of file-operations.h using required libraries
#include "file-operations.h"
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
/**
* @brief open data_base file
*
* @param file_name data_base
* @param mode
*
* @return pointer to data_base
*/
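FILE* prepare_file(char* file_name, char* mode) {
FILE* f = fopen(file_name,mode);
//nothing to skip if the file could not be opened; the caller checks for NULL
if (f == NULL) return NULL;
int i = 0;
char chunk[MAX_LINE_COL];
//move the file pointer to the first word of the dictionary
while( i<27 ) {
//stop early if the file has fewer lines than expected
if (fgets(chunk, sizeof(chunk),f) == NULL) break;
printf("%s", chunk);
i++;
}
return f;
}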
/**
* @brief converts all string characters to uppercase
*
* @param target_string target string
* @return char* res converted sequence of characters
*/
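char* to_upper_string(char* target_string) {
int i = (int)strlen(target_string);
//room for the characters, the trailing '\n' and the terminating '\0'
char* res = (char*) malloc(sizeof(char)*(i+2));
if (res == NULL) {
printf("unable to allocate memory for the target word\n");
exit(1);
}
//lines read from data_base end with '\n', so the target must end with it too
res[i] = '\n';
res[i+1] = '\0';
while(i--) {
res[i] = toupper((unsigned char)target_string[i]);
}
return res;
}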
/**
* @brief read a single line from a file
*
* @param f the file to be read
*
* @return char* string containing the characters read
*/
char* read_file_line(FILE* f ) {
char chunk[MAX_LINE_COL];
//fgets returns NULL at end of file or on error; treat that as an empty line
if (fgets(chunk, sizeof(chunk),f) == NULL) {
chunk[0] = '\0';
}
size_t len_used = strlen(chunk);
//reserve one extra byte for the terminating '\0'
char* line = (char*) malloc(len_used + 1);
if (line == NULL) {
printf("unable to allocate memory for results\n");
exit(1);
}
//copy the characters together with the terminating '\0'
memcpy(line, chunk, len_used + 1);
return line;
}
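read_file_line() reads at most MAX_LINE_COL - 1 characters per call, so a longer line of data_base comes back in several pieces. If that matters for your database, one possible approach is to keep calling fgets and grow the result until a newline shows up. The helper below is a sketch; read_long_line and its chunk size of 64 are our own assumptions, not code from the repository.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Reads one line of any length from f, including its '\n' if present.
   Returns a heap string the caller must free, or NULL at end of file
   or when memory runs out. (Hypothetical helper, not in the repository.) */
char *read_long_line(FILE *f) {
    char chunk[64];
    char *line = NULL;
    size_t len = 0;

    while (fgets(chunk, sizeof chunk, f) != NULL) {
        size_t chunk_len = strlen(chunk);
        /* grow the result to hold the new chunk plus the terminator */
        char *bigger = realloc(line, len + chunk_len + 1);
        if (bigger == NULL) {
            free(line);
            return NULL;
        }
        line = bigger;
        memcpy(line + len, chunk, chunk_len + 1);
        len += chunk_len;
        /* a trailing newline means the line is complete */
        if (len > 0 && line[len - 1] == '\n') {
            break;
        }
    }
    return line;
}
```

Keeping the result of realloc in a separate pointer (bigger) lets us free the old buffer when the reallocation fails, instead of losing it.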
/**
* @brief search a word from data_base
*
* @param target target word
*
* @param f file to be queried
*/
int search_word(char* target, FILE *f) {
if (f == NULL) {
printf("An error occurred during the opening of the database file\n");
exit(1);
}
char* temp = NULL;
while ( ( (temp = read_file_line(f)) ) != NULL ) {
//stop at the end of the file
if (feof(f)) {
free(temp);
break;
}
int found = ( strcmp(temp,target) == 0 );
//each line is allocated by read_file_line and must be released
free(temp);
if (found) return 1;
}
return 0;
}
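search_word() compares each line with the target using strcmp(), so the target has to end with the same newline that fgets() leaves at the end of a line. A possible alternative, sketched here under the assumption that a headword sits alone on its line, is to ignore the newline during the comparison. line_matches is a hypothetical helper of our own.

```c
#include <string.h>

/* Returns 1 when line, ignoring a trailing newline, equals word.
   (Hypothetical helper, not part of the repository.) */
int line_matches(const char *line, const char *word) {
    /* strcspn gives the length of line up to the first '\n' (or its end) */
    size_t line_len = strcspn(line, "\n");
    return line_len == strlen(word) && strncmp(line, word, line_len) == 0;
}
```

With such a helper the searched word would not need the extra newline appended to it.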
/**
* @brief check if a given line corresponds to a new word (end of the current word's definition)
*
* @param temp line being read
*
* @param searched_word current word
*/
int is_new_word(char* temp, char* searched_word) {
int i=0, len = (int) strlen(temp);
if( (len==1) && temp[0] == '\n' ) return 0;
for (i=0 ; i<len-1 ; ++i) {
if( temp[i]<'A' || temp[i]>'Z' ) return 0;
}
if ( strcmp(temp,searched_word) == 0 ) return 0;
return 1;
}
/**
* @brief print the results of a searched word
*
* @param searched_word
*
* @param f file searched
*/
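void print_results( FILE *f , char* searched_word ) {
if ( feof(f) ) {
printf("End of file reached\n");
return;
}
while ( !feof(f) ) {
char* temp = read_file_line (f);
//a line in capitals that differs from the searched word starts the next entry
int done = is_new_word(temp,searched_word);
if( !done ) printf("%s", temp);
//each line is allocated by read_file_line and must be released
free(temp);
if( done ) break;
}
}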
- The file-operations module contains functions which interact with the dictionary database file.
- The directory-operations module contains a function for locating the home directory of the current user.
- The main program calls functions declared in the two header files.
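To see how the search and print steps fit together, here is a compact sketch that does both in one pass and collects the definition in a buffer instead of printing it. It assumes the layout the functions above rely on: a headword in capital letters alone on a line, followed by its definition lines. find_definition and is_headword are hypothetical names, and lines are assumed to be shorter than 100 characters.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* 1 if line holds only capital letters before its newline (a headword) */
static int is_headword(const char *line) {
    size_t len = strcspn(line, "\n");
    if (len == 0) {
        return 0; /* blank line */
    }
    for (size_t i = 0; i < len; ++i) {
        if (!isupper((unsigned char)line[i])) {
            return 0;
        }
    }
    return 1;
}

/* Copies the lines that follow headword `word` into out, up to the next
   headword. Returns 1 if the word was found, 0 otherwise. */
int find_definition(FILE *db, const char *word, char *out, size_t out_size) {
    char line[100];
    int found = 0;

    if (out_size == 0) {
        return 0;
    }
    out[0] = '\0';
    while (fgets(line, sizeof line, db) != NULL) {
        size_t len = strcspn(line, "\n");
        int same_word = (len == strlen(word) && strncmp(line, word, len) == 0);
        if (!found) {
            found = same_word;          /* still searching */
        } else if (is_headword(line) && !same_word) {
            break;                      /* the next entry starts here */
        } else {
            /* append the line only if it still fits in out */
            size_t used = strlen(out);
            if (used + strlen(line) < out_size) {
                strcpy(out + used, line);
            }
        }
    }
    return found;
}
```

Returning the text in a buffer keeps the lookup separate from the printing, which makes the function easy to test with a temporary file.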
Program flow
The main program is dict.c
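#include "directory-operations.h"
#include "file-operations.h"
#include <string.h>
#define DATA_BASE "/mydict/data_base"
#define MODE "r"
//size of the buffer that holds the path to data_base
#define MAX_PATH_LEN 4096
int main( int argc , char** argv ) {
if (argc == 2) {
char* target = to_upper_string(argv[1]);
//get the current home directory
char* homedir = get_homedir();
//build the path in our own buffer: the string returned by
//get_homedir() has no spare room and must not be modified
char path[MAX_PATH_LEN];
if ( homedir == NULL || strlen(homedir) + strlen(DATA_BASE) >= sizeof(path) ) {
printf("unable to build the path to data_base\n");
return 1;
}
strcpy( path , homedir );
strcat( path , DATA_BASE );
FILE *f = prepare_file( path , MODE );
if ( search_word(target,f) ){
printf("%s", target);
print_results(f,target);
fclose(f);
}
else {
printf("No definitions found for : %s", target);
fclose(f);
}
free(target);
}
else {
printf("mydict [word]\n");
}
return 0;
}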
The first two lines include the headers directory-operations.h and file-operations.h, which lets dict.c call the functions they declare. Header files should contain only the declarations of functions, not their implementations. #include "file-operations.h" copies the contents of file-operations.h into dict.c; the same goes for directory-operations.h.
Next, we define the macro DATA_BASE, whose value is the path to the database file relative to the home directory (/mydict/data_base).
Using macros is preferable to repeating string literals, since repeated literals are prone to typing errors. Your code is cleaner and easier to understand with macros, and their values can be changed in one place, no matter how many times they are used in the code.
MODE is a macro for the mode in which the database file is opened: "r", which is read mode, in our case.
main is the function that executes when we run the application; it takes two parameters.
argc (argument count) is an integer that indicates how many arguments were entered on the command line when the program was started, counting the program name itself. Its value is normally greater than or equal to one.
argv (argument vector) is an array of string pointers (a char**) which references the command line arguments. argv[0] references the name of the executable program, which is why argc is at least 1. argv[1] references the first command line argument (the word we enter).
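The relationship between argc and argv can be captured in a few lines. The helper below is a sketch with a hypothetical name, single_argument; it returns the word only when exactly one was given.

```c
#include <stddef.h>

/* Returns the single word passed on the command line, or NULL when the
   program was started with no argument or with more than one.
   (Hypothetical helper, not part of the repository.) */
const char *single_argument(int argc, char **argv) {
    /* argv[0] is the program name, so one user argument means argc == 2 */
    if (argc != 2) {
        return NULL;
    }
    return argv[1];
}
```

Running mydict apple gives argc == 2 and argv[1] pointing to "apple", while running mydict alone gives argc == 1 and the helper returns NULL.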
- if (argc==2) checks that a single word was passed as a parameter to the program.
- to_upper_string() copies this word, converts the copy to upper case and returns a string pointer to it. target therefore points to an upper case version of the word passed as command line argument. This conversion is necessary because the words to be searched in data_base are in upper case.
- get_homedir() gets the home directory of the current user, so homedir points to a string holding the home directory ($HOME).
- strcat() appends DATA_BASE to the home directory, which gives the absolute path to the database file, i.e. "/home/your_name/mydict/data_base".
- Knowing the absolute path of the database file, we create a file pointer (f) to this file using prepare_file(). prepare_file() opens data_base in read mode and returns the corresponding file pointer.
- if ( search_word(target,f) ) searches data_base for the string pointed to by target and returns 1 or 0 (true or false). If the result is true (1), print_results() prints the lines of data_base from target to the end of its definitions, and data_base is then closed using fclose(). If the result is false (0), we simply tell the user that there are no definitions for the searched word.
- All of the above happens when argc==2. Otherwise, the syntax for running the program, mydict [word], is displayed on the command line, meaning that only one word can be searched at a time.
- return 0 marks the end of the program.
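One way to build the database path without writing into the string returned by get_homedir() is to format it into a buffer owned by the caller. The sketch below uses snprintf for this; build_db_path is a hypothetical name, and the home directory in the usage note is only an example value.

```c
#include <stdio.h>

#define DATA_BASE "/mydict/data_base"

/* Writes home followed by DATA_BASE into out.
   Returns 1 on success, 0 if the result does not fit in out_size bytes.
   (Hypothetical helper, not part of the repository.) */
int build_db_path(char *out, size_t out_size, const char *home) {
    int written = snprintf(out, out_size, "%s%s", home, DATA_BASE);
    /* snprintf returns the length it needed; compare it with the room we had */
    return written >= 0 && (size_t)written < out_size;
}
```

Called with a 64-byte buffer and the home directory "/home/alice", the function fills the buffer with "/home/alice/mydict/data_base"; with a buffer that is too small it reports failure instead of overflowing.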
Compilation
Each time we write a function and integrate it into the main program, we need to test the integrated program, and to test it we must compile it. Having many files makes compilation harder: it is not easy to remember which files were updated and must be recompiled one by one. To solve this problem, we use make.
make is an automation tool for running tasks. We write compilation recipes in a file called Makefile or makefile and run them with the command make. Open the file Makefile.
#GNU make default shell is sh, set it to bash
SHELL := /bin/bash
OBJECTS = dict.o file-operations.o directory-operations.o
HEADERS = file-operations.h directory-operations.h
CFLAGS = -Wall
FILE_OPER = file-operations.c file-operations.h
DIR_OPER = directory-operations.c directory-operations.h
#.PHONY avoids conflicts with files in the directory
.PHONY: all
all : mydict
mydict: $(OBJECTS)
cc -o mydict $(CFLAGS) $(OBJECTS)
dict.o: dict.c $(HEADERS)
cc -c $(CFLAGS) dict.c
file-operations.o: $(FILE_OPER)
cc -c $(CFLAGS) file-operations.c
directory-operations.o: $(DIR_OPER)
cc -c $(CFLAGS) directory-operations.c
.PHONY: clean
clean:
rm -f $(OBJECTS)
#run the install.sh script
.PHONY: install
install: install.sh
source install.sh
A Makefile is made up of a set of rules. A rule in make is a triplet made up of a target, a set of prerequisites and a set of commands (the recipe).
- A target is the goal or output of a rule (it can be a file or a name).
- A prerequisite is a file that must exist and be up to date before the target can be built (not all targets need prerequisites).
- A command is an action to be performed.
- We can have variables in make, as in a programming language. We have the variables HEADERS, OBJECTS, FILE_OPER and DIR_OPER, which contain file names, and the variable CFLAGS, which contains the C flag -Wall to enable compiler warnings. It is safer and cleaner to store file names in variables than to write them everywhere in the Makefile (which can lead to errors).
- .PHONY is a special target used to identify phony targets. Phony targets are targets which are not files, but just names. They are useful when you want to perform an action which describes a particular process, such as installing or cleaning. Declaring them phony avoids conflicts with existing files of the same name.
- mydict is a target (the final executable file) which has $(OBJECTS) as prerequisites. We use the dollar sign $ and parentheses () to access the content of a variable. The command which follows links the object files to obtain the mydict executable. Since the object files do not exist yet and are required here, make first builds them as targets by compiling their source files, and only then finishes building the mydict target. In general, this recursive process continues until every prerequisite is found in the current directory. The make utility tracks the modification times of files and rebuilds a target only when one of its prerequisites is newer than the target.
- The clean target removes the object files produced during compilation.
- The install target installs the executable file. This involves sourcing (executing) the install.sh file in this directory.
- To execute the Makefile we run the command make; this builds the default target all and hence the mydict executable.
- Run make clean to remove the object files produced during the compilation step.
- Run make install to install the executable file. Make sure that you have changed the execution mode of install.sh before running this command.
Open the README.md file to get the order of command execution.
# dictionary-in-c
## compilation
#### $ sudo chmod 777 install.sh
#### $ make all
#### $ make clean
#### $ make install
The file install.sh contains shell commands for the installation of the application.
#!/bin/bash
#create the mydict directory if absent
if [ ! -d "$HOME/mydict" ]
then
mkdir "$HOME/mydict"
fi
#copy the necessary files
cp data_base "$HOME/mydict"
cp mydict "$HOME/mydict"
#enable mydict command to be accessed every where for this user
#single quotes keep $PATH and $HOME unexpanded until .bashrc is read
echo 'export PATH=$PATH:$HOME/mydict/' >> "$HOME/.bashrc"
source "$HOME/.bashrc"
The last command in the Makefile executes install.sh, which installs the dictionary. Installation simply copies the executable file and the database file to the necessary locations and makes them accessible.
- The first line, #!/bin/bash, is termed the shebang; it instructs the OS to execute install.sh using bash when the script is run directly.
- The if test checks whether the directory /home/your_name/mydict is missing and creates it in that case.
- The two cp commands copy the executable mydict and the database file data_base into the directory just created.
- The echo command appends the line export PATH=$PATH:$HOME/mydict/ to the .bashrc file present in your home directory. This line adds the mydict directory to your $PATH (path variable).
- The last command, source $HOME/.bashrc, executes the .bashrc file so that the path variable is updated. Because make runs install.sh in its own shell, the change reaches your terminal only when .bashrc is read again, for example in a new terminal session.
After running this script, you can type mydict in any directory and get a result. The images below show some examples.
- Running the command with no argument
- Running the command with a non-existing word
- Running the command with an existing word
Conclusion
With this article at OpenGenus, you should now have a complete idea of how to develop a dictionary console application in the C programming language.
A more sophisticated dictionary could be designed with a database management system such as MySQL. MySQL allows us to obtain the definitions of a word as structured data, so each piece of data can be stored in its own variable. This eases formatting, compared with reading a whole block of text from a file and writing it to the console. The header file <mysql.h> declares functions for connecting to and interacting with MySQL databases.