Another long-term project and once again for internal usage, but this one's special! Ok, now, close your eyes, and let your imagination off the hook. You have a company that functions since 1953. They have a legacy app (created in the 90s) that needs total rewrite from the ground up. This legacy app contains data since 1970s stored in dBase (before dBase it was excel, and before that paper, but somebody has already done that migration in 1998). You have around a year to migrate the data to MSSQL, create the new app and add new features. Welcome to my situation! I really enjoyed this rewrite, as it was unique and highly rewarding (even tho I'm usually against rewrites).
Showcase
Data migration
I started by looking at the data and migrating them to a new database. This step alone took a long time, because I was unable to find any ready-to-use solution that could migrate .dbf (dBASE) files into MSSQL (at least not at the scale I needed), so I had to come up with my custom quick-and-dirty solution using Dapper and DbfDataReader packages (feel free to steal it next time you need to migrate dBASE files to MSSQL or any other database :D).
using Dapper;
using DbfDataReader;
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.Globalization;
using System.IO;
using System.Linq;
using System.Text;
// Hey, just to clarify before anyone brings it up, I do realize that this code has some SQL injection, security vulnerabilities and bad security practices,
// however, since it's just a small utility tool that was/will be used by the ADMIN only once or twice, I'm not too worried about it. :)
namespace YourNamespace.YourProjectName
{
class Program
{
// Could be constant, but in the private version of the code, there is more stuff going on behind the scenes to obtain connection string
private static protected string ConnectionString => "your connection string, you shouldn't hardcode connection strings, but this is utility tool and was used like once or twice. Just make sure to not push credentials ;)";
private const string prefix = "rk_"; // To easily recognize migrated tables
static void Main()
{
while (true) {
Console.ForegroundColor = ConsoleColor.White;
Console.Write("Migrate tables [dbf] (1), data [dbf] (2), or update tables to usable state (3): ");
char answ = Console.ReadKey().KeyChar;
switch (answ) {
case '1':
CreateTablesDBF();
break;
case '2':
FillInTablesDBF();
break;
case '3':
FormatTableToMakeItUseable();
break;
default:
return;
}
Console.ForegroundColor = ConsoleColor.Green;
Console.WriteLine($"Finished.");
}
}
private static string getTableName(string fileName) => fileName[fileName.LastIndexOf("\\")..].Replace(".dbf", "").Remove(0, 1);
// Specific to my case, because columns had logical flaw
private static string formatColumnName(string columnName) => (columnName == "AS" || columnName == "as") ? "AS_0" : columnName;
// Should be right according to the dBASE standard
private static string getColumnType(DbfColumnType type, int charLen) =>
type switch {
DbfColumnType.Character => $"nvarchar({charLen})",
DbfColumnType n when (n == DbfColumnType.Number && charLen > 0) => "float",
DbfColumnType.Number => "int",
DbfColumnType.Date => "date",
DbfColumnType.Float => "float",
DbfColumnType.Currency => "decimal(18, 18)",
DbfColumnType.DateTime => "datetime",
DbfColumnType.Boolean => "bit",
DbfColumnType.Memo => "ntext",
DbfColumnType.Double => "float",
DbfColumnType.General => "nvarchar(365)",
_ => throw new NotImplementedException("This datatype is not implemented!"),
};
private static void CreateTablesDBF()
{
Console.Write("\nInput folder location with .dbf files: ");
string path = Console.ReadLine();
string[] files = Directory.GetFiles(path);
foreach (string file in files.Where(s => s.EndsWith(".dbf"))) {
string tableName = getTableName(file);
StringBuilder builder = new StringBuilder(
$"IF OBJECT_ID(N'dbo.{prefix}{tableName}', N'U') IS NOT NULL DROP TABLE dbo.{prefix}{tableName};" +
$"CREATE TABLE {prefix}{tableName} (" +
"ID_GENERATED INT NOT NULL IDENTITY PRIMARY KEY,"
);
using (DbfTable dbfTable = new DbfTable(file)) {
foreach (var field in dbfTable.Columns) {
string name = formatColumnName(field.ColumnName);
string type = getColumnType(field.ColumnType, field.Length);
builder.Append($"{name} {type} NULL,");
}
}
string command = builder.ToString().TrimEnd(',') + ");";
using (IDbConnection connection = new SqlConnection(ConnectionString)) {
if (connection.State == ConnectionState.Closed) {
connection.Open();
}
connection.Execute(command);
}
}
}
private static void FillInTablesDBF()
{
int sucess = 0;
int noSucess = 0;
Console.Write("\nInput folder location with .dbf files: ");
string path = Console.ReadLine();
string[] files = Directory.GetFiles(path);
foreach (string file in files.Where(s => s.EndsWith(".dbf"))) {
string tableName = getTableName(file);
List commands = new List {
$"DELETE FROM dbo.{prefix}{tableName};"
};
using (DbfTable dbfTable = new DbfTable(file)) {
StringBuilder columnNames = new StringBuilder();
foreach (var field in dbfTable.Columns) {
columnNames.Append($"{formatColumnName(field.ColumnName)}, ");
}
string commandColumns = columnNames.ToString().TrimEnd().TrimEnd(',');
DbfRecord dbfRecord = new DbfRecord(dbfTable);
while (dbfTable.Read(dbfRecord)) {
if (dbfRecord.IsDeleted) {
continue;
}
StringBuilder commandValues = new StringBuilder();
for (int i = 0; i < dbfRecord.Values.Count; i++) {
string value = dbfRecord.Values[i].ToString().Replace('\'', '\"');
string columnType = getColumnType(dbfTable.Columns[i].ColumnType, dbfTable.Columns[i].Length);
// My case specific
if (value == "- -") {
value = " ";
}
// Depends on your data's date localization. Mine were in czech, so I had to parse it
else if (columnType == "date" && DateTime.TryParseExact(value, "dd.MM.yyyy", null, DateTimeStyles.None, out DateTime d)) {
value = d.ToString("yyyy-MM-dd");
}
// Depends on your data's floating localization. Mine were in czech, so I had to parse it
else if (columnType == "float" && double.TryParse(value, out double dou)) {
value = dou.ToString().Replace(",", ".");
}
// dBASE specific
else if (columnType == "int" && int.TryParse(value, out int v)) {
value = v.ToString();
}
commandValues.Append($"{(string.IsNullOrWhiteSpace(value) ? "NULL" : $"'{value}'")}, ");
}
commands.Add($"INSERT INTO dbo.{prefix}{tableName} ({commandColumns}) VALUES ({commandValues.ToString().TrimEnd().TrimEnd(',')});");
}
}
Console.WriteLine($"------ Finished generating command for {tableName} ------");
insertIntoDb(commands, out int sucessNew, out int noSucessNew);
sucess += sucessNew;
noSucess += noSucessNew;
}
Console.WriteLine($"Executed: {sucess + noSucess}\nSuccesses / Failures: {sucess} / {noSucess}");
}
private static void FormatTableToMakeItUseable()
{
List commands = new List() {
// Insert commands to setup database after data migration (renaming columns, adding columns, altering data types,...)
};
using (IDbConnection connection = new SqlConnection(ConnectionString)) {
if (connection.State == ConnectionState.Closed) {
connection.Open();
}
foreach (string command in commands) {
connection.Execute(command);
}
}
Console.WriteLine(""); // To make output look better :D
}
// Since we're migrating milions of rows, we need to separate the commands to chunks
// (default limits)
// +
// (generally a good idea when pushing large amounts of data, if you don't want to [D]DOS your DB)
private static void insertIntoDb(List commands, out int sucess, out int noSucess)
{
sucess = 0;
noSucess = 0;
using (IDbConnection connection = new SqlConnection(ConnectionString)) {
if (connection.State == ConnectionState.Closed) {
connection.Open();
}
// Pre-process command for sql, split into chunks of 500 commands
// commands.Reverse();
List> commandChunks = commands.Select((x, i) => new { Index = i, Value = x })
.GroupBy(x => x.Index / 500)
.Select(x => x.Select(v => v.Value).ToList())
.ToList();
// For each chunk, make string from chunk and then send it as command
for (int i = 0; i < commandChunks.Count; i++) {
List chunk = commandChunks[i];
string command = string.Join("", chunk.ToArray());
bool saved = connection.Execute(command, commandTimeout: 60) > 0;
sucess += saved ? 1 : 0;
noSucess += saved ? 0 : 1;
Console.ForegroundColor = saved ? ConsoleColor.Green : ConsoleColor.Red;
Console.WriteLine($"------ [{i + 1}/{commandChunks.Count}] {(saved ? "Data successfully saved" : "ERROR")} ------");
Console.ForegroundColor = ConsoleColor.White;
}
}
}
}
}
I would also like to "shout-out", a tool called DBFPlus that has been really helpful for working with .dbf files while developing my utility.
The code may look simple, but the path that led me there was long and hard. There were a lot of problems, mainly because I initially received the data in Excel format (yes, .xlsx), and after a week of: banging my head against the wall, performance + compatibility issues, and much more, I decided to just use the raw .dbf files.
Developing the application
Although it may not look so complicated, there are a lot of little quirks that took me some time. With data migration out of the way, I was ready to start developing! I started by listing all of the necessary + new functionalities and created a pool of tasks. Some core features needed to be done before everything else, like basic CRUD operations, which I assigned a higher priority. Also, since the software does not have to be particularly visually pleasing, I just quickly drew the "design" of the app on a paper and called it "design prototype". Users had some reminders, we went back and forth for a bit on the design and after some time agreed on the final look. After that, I went and incrementally implemented functionality by functionality. It wasn't that hard, it's just that the legacy app contained A LOT of features, which accumulated over the years, and now I had to re-implement them. The functionality shown in the showcase video is like 40% of everything that the program can do, and I could go on and on about everything it can do for hours, however, I would rather focus on interesting parts of the development process.
For example, I had access to the original source code, and Word/Excel API has not introduced breaking changes since the 90s (at least for functions I used)! The code originally written in VBA is still compatible with the current version of the Microsoft Office suite! I found that really refreshing because I was originally worried about calling Word API from C#, however I just had to re-write the original code from VBA to C# and everything worked right away.
Another "challenge" was the need for this program to run 100% locally. This meant that I had to set up internal DB, API, and file storage servers locally. However, since we're storing important data, we should have it backed up somewhere in case something fails (building burns down, hard drive dies, or whatever, also, let's ignore the fact that the hard drives are in a RAID). Luckily, Azure provides a service called Azure Backup (service), which solves that issue for me. Also, this software is, like many others, used by multiple people, so I had to implement a login and login management system. Initially, I wanted to use some plug-and-play solution like Microsoft's Azure Active Directory, however, that wouldn't be viable because of the 100% local rule. Instead, I implemented one myself :D (it ain't much, but it's honest work).
What did I learn?
Overall, I enjoyed this project, because I got familiar with a lot of old tools and ways of writing software. This gave me a new perspective on software development.
I hope you enjoyed this article/blog post!
If you have any questions, problems or want to start a discussion, don't hesitate and write me an email!