Data visualization of legal fees, billing, and amounts left in trust

I created this interactive report using Excel and Power BI to show one possible solution for attorneys who want to see information about their clients, the legal fees they incur, and other information about their case. Although this data is hypothetical, a report like this could be beneficial to the attorney and law firm.

When viewing this report on this website, you can click on the “Fit to page” button located on the bottom right side of the report, zoom in and out of it, and click anywhere in the report to see how the data changes.

In most lawsuits, a client pays an attorney a large dollar amount, called a retainer, at the start of the case, and this money is put into a trust bank account. Attorneys then bill the client for the work done on their case. But attorneys who handle client cases are often one step removed from the amount a client has left in the trust account. They may not handle billing and collecting payments from the client; someone else in the firm may do this.

But an attorney may want to know how much is left in the trust account in order to decide what next steps to take in the case. The report I’ve created can quickly answer many of the attorney’s questions about this.

This report is interactive, and an attorney can click through it to see what legal activities are being performed for the client each month, how much they total, when a trust account has a balance less than a certain amount and should be replenished, when a retainer is paid and more.

I like giving short directions with my Power BI reports because I want the user to understand how to use the visual in order to get the information they want quickly. I also make my reports clean in terms of design; although I appreciate art, data visualizations should be useful and easy to use. Additionally, I like reports that have contact information showing how the user can get help or ask questions about the report. As an attorney myself, I know that getting the information I need quickly is very important.

Power BI is a great tool for creating interactive reports; however, the default design elements are not wonderful. Having taught design when I was a professor, I know that color choices can communicate certain information.

For this report, I wanted the colors to mean certain things. So in the table that shows amounts left in trust, blue numbers mean a client has more than $2000 in their trust account. Red numbers mean the trust account balance is getting low, and the attorney may need to ask for another retainer, change how the case is being handled, etc.
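The color rule above can be written down as a small function. This is only a sketch in Python, not how the report itself is built (in Power BI a rule like this would typically be set up through conditional formatting). The client names and balances are hypothetical, and treating a balance of exactly $2000 as "low" is my assumption.

```python
from decimal import Decimal

# Dollar threshold taken from the post: more than $2000 is shown in blue.
LOW_BALANCE_THRESHOLD = Decimal("2000")


def balance_color(balance: Decimal) -> str:
    """Return the display color for a trust-account balance."""
    # Assumption: a balance at or below the threshold counts as "getting low".
    return "blue" if balance > LOW_BALANCE_THRESHOLD else "red"


# Hypothetical clients and balances, for illustration only.
trust_balances = {
    "Client A": Decimal("3500.00"),
    "Client B": Decimal("2000.00"),
    "Client C": Decimal("412.75"),
}

for client, balance in trust_balances.items():
    print(f"{client}: ${balance:,.2f} -> {balance_color(balance)}")
```

Keeping the threshold in one named constant makes it easy to change if a firm prefers a different replenishment level.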

For the bar and pie charts, I wanted to group certain legal work together. Since I’ve practiced law, I know that Court Appearances and Travel are billed on the same day. And things like Discovery, Research, and Depositions are similar types of legal work. Whenever I’ve communicated with a client or opposing counsel, I needed to review the file and often did drafting work on motions after our discussions.

So I went to Adobe’s Color Wheel and selected a double split complementary color scheme. Then I opened the palette in Adobe Photoshop and added additional colors there. The result was this color palette:

color palette

I really enjoyed building this report. If you have any questions about it, or the process, feel free to email me at

U.S. Law Firms’ Business Patterns

I analyzed data collected from the U.S. Census for the years 2017 through 2020 that measures business patterns of U.S. law firms.

Areas of practice include corporate law, family law, estate planning, patent attorneys, legal aid services, real estate law, tax law and others. The data does not include public defenders, prosecutors, US attorneys, or attorneys general.

I used Power BI to clean, transform, and analyze the data. After reviewing the data, I created charts, plots, tables and other visualizations and built an interactive report. Click on the different parts in the report below to see how the data changes.

What can you conclude about this data?

  1. Most law firms in the U.S. have between one and four employees and are S-corporations. In other words, most law firms are really tiny in terms of the number of people working for the firm. For comparison, the Small Business Administration defines “small businesses” as having 250-1500 employees, depending upon the industry! Most small business law firms in the U.S. have very few people who shoulder all of the business and legal responsibilities. They are truly multi-taskers.
  2. The total number of law firms has decreased every year. By clicking on each of the years, we can see that the total number of law firms went from 173,019 in 2017 to 169,582 in 2020. And this data does not include the effect the pandemic had on the existence of U.S. law firms.

However, if we look at the changes in the total number of firms by employee size, our calculations show that the total number of firms with one to 19 employees decreased from 2017 to 2020, while the total number of firms with 20 to over a thousand employees increased in that same time period.

Employee size        2017       2020       Change
All firms            173,019    169,582    2.0% decrease
1 – 4 employees      127,001    123,832    2.5% decrease
5 – 9 employees      25,298     24,841     1.8% decrease
10 – 19              11,742     11,646     0.8% decrease
20 – 49              6,155      6,311      2.5% increase
50 – 99              1,759      1,853      5.3% increase
100 – 249            825        854        3.5% increase
250 – 499            185        187        1.1% increase
500 – 999            45         46         2.2% increase
1000 +               9          12         33.3% increase
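
The percentages in the table above are ordinary percent changes between the 2017 and 2020 counts. I did the calculations in Power BI, but here is a small Python sketch of the same arithmetic, using a few rows from the table (the function name is just for illustration):

```python
def percent_change(old: int, new: int) -> float:
    """Percent change from old to new, rounded to one decimal place."""
    return round((new - old) / old * 100, 1)


# Firm counts copied from the table above: (2017, 2020).
firm_counts = {
    "All firms": (173_019, 169_582),
    "1 - 4 employees": (127_001, 123_832),
    "20 - 49": (6_155, 6_311),
    "1000 +": (9, 12),
}

for size, (count_2017, count_2020) in firm_counts.items():
    change = percent_change(count_2017, count_2020)
    direction = "increase" if change > 0 else "decrease"
    print(f"{size}: {abs(change)}% {direction}")
```

Note how a change of only three firms in the "1000 +" group produces a large percentage, because the starting count is so small.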

We can see the change in the number of law firms visually as well, although it is slight. The chart below is interactive and shows changes in the number of law firms from 2017 through 2020. You can click on it to see more information and also hover over each vertical year area to see tool-tips that show the exact number of law firms with various employee sizes.

Since I broke apart the number of employees, I decided to do the same for type of organization. What organizational type increased in number from 2017 to 2020? What decreased? Below are the answers.

Right off the bat we can see that S-corporations increased in numbers from 2017-2020, while it appears the other business organizations decreased or remained about the same. Just to make sure, I ran the following calculations…

Organization type       2017      2020      Change
S-corporation           83,403    88,627    6.3% increase
Sole proprietorships    39,990    34,079    14.8% decrease
Partnerships            28,142    26,853    4.6% decrease
C-corporation           19,240    17,739    7.8% decrease
Non-profit              2,226     2,235     0.4% increase
Other noncorporate      23        52        126.1% increase

Although the large percentage increase in “Other noncorporate” stands out, remember that very few law firms fall into this category compared with the traditional business types. We can add more information to what we concluded earlier: overall, the number of law firms decreased from 2017 to 2020, and the most common firm size is one to four employees. Although the number of law firms decreased, the change did not happen at the same rate across all business types. Sole proprietorships saw the biggest decrease while S-corporations actually increased in numbers.

Finally, the Census data I pulled was only available from 2017 through 2020. A LOT happened in the last 2 years, of course. How did the pandemic affect law firms? What about the forgivable loan program called the COVID-19 EIDL, which many solos and small law firms received? Did that have an effect on the business type change? All interesting things to ponder.

Analyzing Twitter Data using Power BI

A popular software tool for analyzing data is Microsoft’s Power BI. Many people call it ‘Excel on steroids’, which is an easy way to think of it.

I’ve been going through bootcamps/tutorials/courses to better learn Power BI because I believe it is a great solution for businesses, including law firms and legal tech companies. We are in the midst of a data storm with more and more data being created every day. I’ve repeatedly heard people saying they know their company has (or produces) a lot of data, but it is “messy”.

Messy data is a whole other blog post. For now, I’m going to start posting examples of charts, tables, and dashboards I’ve created after analyzing data using Power BI.

And the first one has to do with looking at data from my Twitter account, @SaraKubik.

Here is the final dashboard:

Here are the steps I took:

  1. In Twitter, go to <More, and then <Analytics.

You’ll be presented with a new page, and I encourage you to explore Twitter’s own activity measures. They are quite good. I want to download data from Twitter so the next thing I do is….

2. At the top of this page, click on <Tweets

3. I changed the button that says “Last 28 Days” to various months.

4. I then downloaded the data by clicking on the “Export Data” button and selecting “By Tweet”. Remember where you’ve saved each of these csv files.

5. Now fire up Power BI, or more specifically, Power BI Desktop.

Power BI Desktop is part of the Power BI product suite. It’s free and available for download here:

6. Open Power BI Desktop and select <Get Data.

I’m not going to explain in detail what I did in Power BI Desktop. Overall, I imported the four csv files, merged them together, transformed the data, analyzed it, and created various charts, plots and text. I placed them on an interactive dashboard so a user can click on the charts and plots and drill down for more details.
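
For readers who prefer code, the "import several csv files and merge them" part can be sketched in Python. This is not what I did in Power BI Desktop; it is a minimal stand-in using only the standard library. The file names, the column names ("Tweet text", "impressions"), and the sample rows are all hypothetical, and the sketch assumes every export has the same columns.

```python
import csv
import tempfile
from pathlib import Path


def merge_csv_exports(paths):
    """Read several CSV exports with the same columns into one list of rows."""
    merged = []
    for path in paths:
        with open(path, newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                # Remember which export each row came from.
                row["source_file"] = Path(path).name
                merged.append(row)
    return merged


def total_impressions(rows):
    """Add up the (hypothetical) impressions column across all rows."""
    return sum(int(row["impressions"]) for row in rows)


# Write two tiny sample exports to a temporary folder, then merge them.
with tempfile.TemporaryDirectory() as folder:
    samples = {
        "tweets_month_1.csv": [("First tweet", 120), ("Second tweet", 45)],
        "tweets_month_2.csv": [("Third tweet", 300)],
    }
    paths = []
    for name, tweets in samples.items():
        path = Path(folder) / name
        with open(path, "w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            writer.writerow(["Tweet text", "impressions"])
            writer.writerows(tweets)
        paths.append(path)
    merged_rows = merge_csv_exports(paths)

print(len(merged_rows), "tweets,", total_impressions(merged_rows), "impressions")
```

Tagging each row with its source file makes it easy to compare months later, which is similar in spirit to clicking through the months on the dashboard.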

My goal when creating any data visualization is to present material clearly. I’m not one of those designers who over-designs. I want the users to be able to understand the visuals, not be intimidated by them, and be able to glean answers and conclusions from what I’ve created. I can talk geek all day long. But in the end, I believe visuals need to be easy to understand.

7. The final step I took was to publish the dashboard.

Here’s the thing: that’s not a free service. You’ll need to purchase Power BI Premium in order to publish your Power BI Desktop reports. There are other benefits when you purchase Power BI Premium, and it’s not terribly over-priced (I’m doing the $10/month plan).

I hope you’ve enjoyed this post. I’ve enjoyed making it because I love challenges and learning new things. I’m going to start posting more Power BI material soon.

Time-Series Analysis of New Covid Cases

I used two different time-series libraries in a Python time-series analysis model to predict new Covid cases in the United States.

Data was taken from ‘Our World in Data’, which showed, in part, daily numbers of new Covid cases for the United States from January 23, 2020 – January 9, 2022.

The first model used LinkedIn’s time series algorithm called Silverkite, and LinkedIn’s library called Greykite. (What’s with the kites?)

The model forecasts new Covid cases in the U.S. 90 days ahead of today’s date (which was January 10, 2022, when I created this post), with a prediction interval of 95%.

The results from this analysis are as follows:

The second model I created also used LinkedIn’s Silverkite algorithm, but instead used Facebook’s Prophet as the library to predict new cases. When you run the Python commands (I used a Jupyter Notebook), you’ll get warnings that Prophet will disable yearly and daily seasonality unless you change a setting. I kept the default because I didn’t want to take seasonality into consideration at the first running of the model.

The results from this analysis are as follows:

So what do you think of these different predictions? I think both models are going to be wrong in actual numbers of new cases because of the rise of the Omicron variant. And this dataset did not differentiate between the Covid strains. Omicron is reported to be easier to spread but, overall, affects people less severely than the other Covid strains.

But one thing that I find interesting is how aggressive each model is in forecasting new cases. Greykite shows a drop in cases after the train end date (Jan 9, 2022) that continues through February 2022, but then it quickly moves back up. Prophet also shows a drop after the train end date (Jan 9, 2022) but then makes a slow upward trend.

I hope Prophet is right, but we will see.

Another thing I wanted to do was compare my models with the data posted on the NY Times about U.S. cases. I did a screen capture and, using Photoshop, overlaid the new reported cases data onto my Greykite model. (Please note: I taught Photoshop for a long time and fully believe using this tool is an important way for data scientists to better communicate their findings.)

The results are as follows:

The NY Times data ended on Jan 8th while the ‘Our World in Data’ data ended on Jan 10th, but the number of new cases is a nice match between the two sources.

I’ll keep watching to see how accurate my forecasting models are at predicting new cases of Covid in the U.S. I’ve posted my Jupyter Notebook and csv files on my Github, located here.

Pill Identifier: A way to identify pills using a non-internet-connected phone

The extended write-up is here:

This is the shorter write-up for my blog.


Have you ever taken a pill out of the container it came in and immediately forgotten the name of the pill? I have, especially when I organize the pills I need to take each day into a daily container.

Once my pills are in here, I never remember what each pill is.

I take a lot of vitamins, so I don’t worry too much about when I take my pills or in what order. But what if you’re one of the millions of people who take multiple pills each day for things like diabetes, cancer, high blood pressure, and other medical issues? It’s probably critical that you organize your pills according to how your doctor tells you to take them. Mixing up pills, taking them at the wrong time (time of day/night, with/without food, with/without other medications), and generally not knowing what pills you have once they are removed from the pill bottles are serious issues.

My mother takes about 20 pills a day. I am worried that she won’t always know which pills she has to take, so I looked at pill identification apps and websites. Every one of them requires you to be connected to the internet.

And that’s a problem for many people. It may come as a surprise to you, but there are millions of people in the U.S. who do not have any type of internet connection at home. The largest group is older adults. (See for more information.) And that’s the group that is most likely to take medication.

So I set out to build a pill identifier that can run on a phone when it is not connected to the internet.

We all have used devices that are not connected to the internet, even our phones when they are on airplane mode. Technically, if a device can run computations without the need for an internet connection, you could call it an edge device.

Devices that run on the edge can include your phone, wearable devices like smart watches, sensors, autonomous vehicles, and other IoT devices. Simply stated, edge computing can be done on the device itself; you don’t need to send information to the cloud.

That’s an enormous benefit because edge computing reduces latency (delays) when it is processing information. For my purposes, I’m most interested in the ability of edge devices to not be connected to the Internet because of the problems I mentioned above. Millions of people do not have reliable internet, or high speed internet, or any internet at home. Yet there are ways to build solutions based upon running computations on edge devices.

Project building

1. To create my dataset, I borrowed 6 pills from my mother. I chose to photograph only 6 different pills of brand-name drugs for this project because they were ones I had access to and could photograph myself.

Trying to find multiple copyright-free photos of the same pills to use in a machine learning model was impossible. I had also spent a day looking online on WebMD at various pills. Silly me, I didn’t know just how many different shapes, sizes, colors, and identifying marks pills have.

So I decided to photograph the pills myself.

The six pills I photographed for this project.

When thinking about my dataset, I thought about how a person would hold their phone when using it to classify their pills. They would probably place the pill on a plate, or on a countertop. So the backgrounds would be varied. They may or may not have the pill centered in the frame. Focusing on the pill would not be a given. They may hold their phone at different distances from the pills. There will certainly be a difference in the lighting captured and the cast shadows produced from the light source.

I decided I needed a lot of photos. There were several ways I obtained photos of the pills…

2. Photographing pills using my phone: I used my phone to photograph each pill many, many, many times. I placed the pills separately on different countertops, at different angles, at different sizes in the frame, at different locations in the frame. I then brought these images into Photoshop to see how they looked. I saved them as png files and uploaded these images to Edge Impulse.

3. Data augmentation: I wrote Python code in Jupyter Notebooks to perform data augmentation on the images I took of the pills. I rotated images, moved them around in the frame, changed the brightness and darkness, blurred some, and added noise to others. These files were also saved as png files and also uploaded to Edge Impulse.

4. Directly uploading photos to Edge Impulse: Edge Impulse allows you to connect your phone to your project, and then use your phone to upload photos to their website. These photos were ultimately saved as jpgs.

In the end, I wound up with 1189 photos for my training and validation dataset, and 305 photos in my testing dataset.
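
To give a feel for the data augmentation step (rotating, shifting, brightening or darkening, adding noise), here is a toy sketch. It is not the code from my notebooks: it works on a tiny grayscale "image" stored as a list of lists and uses only the Python standard library, whereas real photos would normally go through an imaging library. Blurring is left out to keep it short, and all function names are made up for this example.

```python
import random


def rotate_90(image):
    """Rotate a row-major grid of pixels 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def adjust_brightness(image, delta):
    """Add delta to every pixel, clamped to the 0-255 range."""
    return [[max(0, min(255, px + delta)) for px in row] for row in image]


def shift_right(image, pixels, fill=0):
    """Move the content right by `pixels` columns, padding with `fill`."""
    return [[fill] * pixels + row[: len(row) - pixels] for row in image]


def add_noise(image, amount, rng):
    """Add random noise in [-amount, amount] to every pixel, clamped to 0-255."""
    return [
        [max(0, min(255, px + rng.randint(-amount, amount))) for px in row]
        for row in image
    ]


# A tiny stand-in for a pill photo: a bright blob on a dark background.
pill = [
    [0, 0, 0, 0],
    [0, 200, 220, 0],
    [0, 210, 230, 0],
    [0, 0, 0, 0],
]

rng = random.Random(42)  # fixed seed so the noisy variant is repeatable
augmented = [
    rotate_90(pill),
    adjust_brightness(pill, 40),
    adjust_brightness(pill, -40),
    shift_right(pill, 1),
    add_noise(pill, 10, rng),
]
print(f"1 original image -> {len(augmented)} augmented variants")
```

Each variant keeps the same label as the original photo, which is how a modest number of real photos can grow into a much larger training set.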

An example of a photo image in my dataset.

4. I then built my model in Edge Impulse. After a lot of work on my training, validation, and testing various models and looking at accuracy rates, I landed on the following for my neural network model:

Keras neural network image classification model that had 100 epochs, a 0.001 learning rate, and multiple layers.

This model resulted in an accuracy rating of 87.8% and a loss of 0.47. The confusion matrix was OK, and I knew I might have problems with a few of the pills based upon what it showed me.
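
If you have not read a confusion matrix before, this sketch shows how an overall accuracy number and per-pill problem spots come out of one. The matrix below is hypothetical (three made-up pills, not my six, and not my actual results); the point is only the arithmetic.

```python
def accuracy(confusion):
    """Overall accuracy: correct predictions (the diagonal) over all predictions."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


def per_class_recall(confusion, labels):
    """For each actual class, the share of its examples classified correctly."""
    return {
        label: confusion[i][i] / sum(confusion[i])
        for i, label in enumerate(labels)
    }


# Hypothetical matrix: rows are the actual pill, columns are the predicted pill.
labels = ["pill_a", "pill_b", "pill_c"]
confusion = [
    [18, 2, 0],
    [1, 17, 2],
    [0, 4, 16],
]

print(f"accuracy: {accuracy(confusion):.1%}")
for label, recall in per_class_recall(confusion, labels).items():
    print(f"{label}: {recall:.0%} of actual examples classified correctly")
```

A pill with a noticeably lower share than the others is the kind of thing that tells you where the model is likely to struggle.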

I chose to create a custom deep neural network instead of using a pre-trained transfer learning neural network because the accuracy was much better on my custom network. And my model performed better when the features of the images were kept as RGB instead of converting them to grayscale.

Project testing

5. Testing on my testing dataset: In the many bootcamps I’ve taken, I’ve heard people say they test their machine learning model by setting aside a separate part of the dataset so the model never trains on it. Then, this second dataset is used after the model is trained to “test” the model.

Edge Impulse lets you “test” your model in a way similar to how you’d test it if you were writing straight Python code. My testing resulted in an accuracy reading of 88%, which was pretty good. But the real test was deploying this model to my phone and using my phone to classify the pills.
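
In plain Python, the idea of holding back a testing dataset looks roughly like the sketch below. Edge Impulse handled this for me, so this is only an illustration: the file names are hypothetical, and the only real numbers are the 1189 training/validation photos and 305 testing photos mentioned earlier.

```python
import random


def holdout_split(items, test_fraction, seed=0):
    """Shuffle items and set aside a fraction that the model never trains on."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    test_size = round(len(shuffled) * test_fraction)
    # Everything after the first test_size items is used for training/validation.
    return shuffled[test_size:], shuffled[:test_size]


# Hypothetical file names; the total matches the 1189 + 305 photos in this project.
photos = [f"pill_photo_{i:04d}.png" for i in range(1189 + 305)]
train_photos, test_photos = holdout_split(photos, test_fraction=305 / 1494)

print(len(train_photos), "training/validation photos,", len(test_photos), "testing photos")
```

The important property is that no photo appears in both groups, so the testing accuracy reflects images the model has never seen.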

6. Real world testing: I placed the pills on a countertop and used Edge Impulse to connect my phone to it. I needed the Internet here because Edge Impulse was hosting my files. They packaged up all of the files and built an app that works in a phone’s browser. Once the files are deployed, though, you can disconnect from the Internet and the classification still works! Your phone is being used as an edge device running a machine learning model without having to send information to the cloud (and back) in order for it to work.

There is definitely room for me to try to improve my model’s accuracy, expand to other edge devices, expand the number of pills that can be classified, etc. But overall, I’m happy with the results of the project and learned a lot from creating it.

If you want to look at the Python files used in this project, please visit my Github repo here:

First impressions and shiny things

Like you, I’ve heard about artificial intelligence before. I’m a big science fiction movie fan and prefer it over rom-coms. But lately, the terms “artificial intelligence”, “machine learning” and “neural networks” seem to be cropping up over and over in my Twitter feed. I have never been one to jump into hyped tech because a lot of it can lead you down rabbit holes where you end up with something shiny, but not useful.

So I approached with caution… this was what I kept hearing in my head, with Dorothy’s voice saying, “Artificial intelligence and machine learning and neural networks, oh my!”

I signed up for bootcamp courses online, both free and paid. I watched videos and read content. And one thing I learned is that much of this new terminology is a re-hash of old terminology.

Whispers… machine learning is often just plain ole statistics.

Yes, some things are new. The biggest change I’ve found so far is that the amount of data that can be used now can be huge (big data is the buzzword here). Also, there is a lot of free stuff or open source content. So the knowledge you can obtain about shiny new tech is not locked away in ivory towers. However, you may have to sift through content because a lot is junk, or old, or not helpful in any way.

So I’m learning again, and that makes me happy. I’ll post more about artificial intelligence, machine learning, and neural networks because I *DO* believe in their usefulness. I do think they are more than hype. And just like Dorothy, I say, “Oh my!” but now in an excited-eager-confident way.

Older Adults’ Internet Access

I wanted to build off of my dissertation by focusing on older adults’ uses of the Internet. But I didn’t want to collect my own data so I found a dataset online called the Core Trends Survey. It is available from Pew Research here: .

The Core Trends Survey was conducted Jan 8 – February 2019. The original sample had 1502 adults. I opened the downloaded csv dataset in Excel (because the sample size and file size were small) and kept only older adults ages 55 and up.

My final sample was 465 older Americans, ages 55 and up.

  • All used the internet or email, at least occasionally
  • All had either a smart phone, a tablet, or a computer, or a combination of these devices
  • All had either high-speed internet access (like DSL, cable, fiber optic) or cell phone or tablet access at home (no dial-up). So either a high-speed or a cell connection (note for the future: how these differ in terms of latency and fog computing)
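
The sample criteria above boil down to a simple filter. I did the filtering in Excel and R, so the Python below is only a sketch of the logic; the field names and the five respondents are hypothetical and are not the actual Pew variable names or data.

```python
def is_eligible(respondent):
    """Apply the sample criteria listed above to one survey respondent."""
    return (
        respondent["age"] >= 55
        and respondent["uses_internet"]
        and len(respondent["devices"]) > 0
        # No dial-up: only high-speed or cellular connections at home qualify.
        and respondent["home_connection"] in {"high-speed", "cellular"}
    )


# Hypothetical respondents, for illustration only.
respondents = [
    {"id": 1, "age": 67, "uses_internet": True, "devices": ["smartphone"], "home_connection": "cellular"},
    {"id": 2, "age": 42, "uses_internet": True, "devices": ["computer"], "home_connection": "high-speed"},
    {"id": 3, "age": 58, "uses_internet": True, "devices": ["tablet", "computer"], "home_connection": "dial-up"},
    {"id": 4, "age": 73, "uses_internet": False, "devices": [], "home_connection": "none"},
    {"id": 5, "age": 55, "uses_internet": True, "devices": ["computer"], "home_connection": "high-speed"},
]

sample = [r for r in respondents if is_eligible(r)]
print([r["id"] for r in sample])
```

Writing the criteria as one function keeps them in a single place, which helps when you later want to change the age cutoff or the connection types.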

I cleaned the data, created new variables, and analyzed it using Excel and R. I created some image files using various packages in RStudio, and saved some stacked bar charts as pngs. These were a high level overview of my findings.

My Github has the CSV files, the R code, the pngs, and a read me file:

I was not happy with the stacked bar charts, so I went to Tableau to see if it could produce something better.

I created data visualizations and a dashboard using Tableau. The files are…

  • Breaking apart where the older adults lived:
  • Separating the data by gender and technology ownership:


It’s been a while since my last post. A LOT has happened since then. Most notably, the pandemic. Like the rest of the world, my life was radically changed because of this. It was a period of triage for many of us, and that resulted in my radio silence on this blog.

But, I’m back.

What has not changed is that I’m still an Illinois-licensed attorney mainly focusing on family law. I own my own law firm and that has allowed me the incredible flexibility to be with my children. I can always work more; I can never get back the time I have with my children as they grow up.

So this has re-focused me on my career path. I will always have my law license and actually aim to get licensed in more states when I can. I love learning. The more I learn, the more I want to learn more. (LOL at my grammar but you get my point!)

One common area that I’ve always been interested in is technology. Technology changes so rapidly and that is exciting to me. Throughout my time as a lawyer, I’ve been learning about new technologies, new applications of those technologies, and new developments. I’ve even taken online courses to keep my skillset current and understand how I can use these tools (used by data analysts and data scientists) in combination with what I already know (statistical analysis, research, content creation, the law).

So you’ll see a change in the posts I’m going to make. They’re going to be more data-focused because that is what I’m into now. I’m exploring data analytics, artificial intelligence, machine learning, and big data. They are technical and can be complex, but I aim to be able to explain my thoughts as clearly and efficiently as possible.

How-To Tutorials for Unrepresented Litigants

I’m making a list, you’re checking it twice. We’re going to get you through the divorce process without having to hire an attorney. That’s nice!

I am a family law attorney who represents people in Chicago and the surrounding counties going through divorces and child custody lawsuits. When an attorney is hired by a Petitioner (or Plaintiff) or Respondent (or Defendant), that attorney normally only represents one side in the lawsuit. Not both sides.

Yet I’ve encountered many cases, especially in divorcing couples, where the party I represented WANTED me to work for her (or him) AND the opposing party. Maybe they thought hiring one lawyer means they pay one attorney to do everything, for everyone, in the lawsuit. 

This can be very problematic to say the least.

I had a pro se/unrepresented opposing party shout at me (while in the courthouse, no less) to “do all the f***in paperwork”. His wife, my client, supported this and she, too, ranted to the Judge that she only hired me to “do paperwork”. 

The legal divorce process is more than just “doing paperwork”, so it was obvious that both parties had no idea what an attorney does, or that some attorneys, MOST attorneys, will not represent both sides in a lawsuit.

I used this recent hailstorm as motivation for developing something to help me, opposing parties, other attorneys, etc.

First, let’s look at the problems:

Problem 1: The divorce process is not clear.

Going through a divorce and working with the Courts, judges, and clerks can be a daunting process. One person I represented asked where the checklist was for her to file for divorce by herself in Cook County, Illinois. There wasn’t one.

Problem 2: The e-filing process is not easy to understand.

Although e-filing has eliminated the need to travel to the courthouse and physically give someone legal documents, the e-filing process is not easy to understand.

We live in an age where we expect to be able to instantly learn and understand how to navigate around websites. Shopping online, watching videos, reading online news content… we expect to be able to quickly know how to use their sites. When a person is unable to understand how to navigate through a site, they will abandon it.

Yet people who e-file MUST use the sites. Because in Illinois, all legal documents must now be e-filed.

When you combine problem 1 and problem 2, you normally get frustrated and angry pro se litigants. Sure, there are websites that contain pdf forms that these litigants can download, but there is a lack of good, clear educational content on:

A. How to understand what these forms mean

B. How to fill them out

C. How to e-file them

I was a professor who taught students software, and I now am an attorney who has helped many people get divorced in Illinois. So I made some videos for pro se litigants going through divorces in the Illinois courts.

The first few videos are free and are on YouTube:

1. What is an Appearance and How to Fill it Out:

2. How to e-file an Appearance:

The next set of videos is bundled into an online course. I’ve created a series of videos for someone who wants to file for divorce in Cook County, IL. The course is (creatively) entitled “How to File for Divorce in Cook County, IL” and is priced at $64.99. It’s located here. If you click that Udemy link, you can preview a lesson and see how I’ve broken the course down. I’ll walk you through how to fill out all the forms you need to start a divorce lawsuit in Cook County, IL. I’ll also provide you with the forms, and show you how to e-file them.

Finally, if you have legal questions and Google your legal question, you will get flooded with results. And so many of those results will not pertain to you AT ALL. Divorce and family law is different in every state. But Google’s results do not necessarily tell you that. People get overwhelmed by the sheer volume of information online. I’ve learned that lawyers often have really good blogs with helpful content, but our blogs do not end up on page 1 of a Google search. So if you’re reading this, certainly check out the rest of my blog. But also head over to another Chicago divorce attorney’s blog that I think is super informative. Attorney Russell Knight churns out content quite frequently. And I’m all for boosting another attorney’s work that can help demystify the legal process.

Can law firms operate without having a law office?

When I was an attorney at various family law firms, I rarely met with my clients in my office. There just wasn’t a need for a client to travel to my office in order for me to work with them. I called them. We emailed a lot. We met at court before and after any court appearances.

When I formed my own solo practice, I wondered, “Why do family law firms still have physical offices when most of what we do with clients can be done remotely?”

Maybe law firms have formal physical law offices because that is as it has always been. The stereotypical image of a lawyer is someone standing in front of a row of law books, crossing their arms in front of them and looking serious. And what is true is that the legal profession does not accept change very quickly.

But you know who pays for law offices? Those beautiful offices with leather chairs and mahogany furniture? Ultimately, the client does.

Who ends up paying for a law office? The client does.

If it’s not necessary for a law firm to have a law office, and if the cost of having a law office is something the clients end up paying for, let’s get with the times and operate in a more streamlined way.

Lawyers are providing a service to their clients. Who else provides services? Plumbers, electricians, teachers. You get the idea. Unless you need to buy a product from these service professionals, they do not need public offices.

However, if you want to meet with them, you can. You can meet with an attorney in an agreed-upon location. And how much more convenient is it to meet with your attorney in an area closer to you? 

Driving in an urban area can be hard and costly, and there may be a large distance between you and the law office. Or maybe the distance is short but the travel time is still long (especially during rush hour commutes).

My law firm is called Kubik Legal. It is a family law firm so many of our clients have younger children. We know that you have familial obligations and have taken this into consideration regarding how we work with our clients.

Because we don’t have an office, we don’t have office hours! So we’ll work with your time schedule when we represent you. Whenever we can, we’ll work in a way that you prefer to communicate.

When you work with a lawyer, you will develop a professional relationship with them. So much of what we do involves communication. And in today’s technology-driven world, we can communicate in different ways. The idea that a legal client has to always drive into a law office in order to meet with their lawyer is outdated. Plus, it’s probably going to cost a legal client more when they work with a law firm that operates in this way. Consider working with Kubik Legal. We are an efficient and effective law firm.