Fetching Data in UiPath

It All Begins with Data

I started my career in IT developing database solutions using Microsoft Access. I quickly learned that under the hood, every program is just a high-speed data processor. Every program takes data as an input, does something to it, and produces output. No matter how it is presented to the user, whether a spreadsheet, a form, paragraphs, reports, images, emails, or something else, there is data behind it.

I loved organizing data. When I talked to clients about what they wanted the system to do, I’d be talking in English, but thinking in data. I’d have dreams about tables and relationships. I learned to always start with data and I still think that way.

Let’s look at UiPath from a data layer perspective. UiPath has so many built-in Activities to fetch, manipulate, and output data that it’s easy to lose the forest through the trees. In this post, let’s focus on the input side of a process. Let’s categorize the most common types of data sources that UiPath can use as input, and some of the common UiPath Activities available to fetch each type.

The Data Journey

No matter the type of data source, the journey of a piece of data is the same for every process. First and foremost, a connection to the responding system is established, if it’s not local. Second, a command is given to fetch the data. The request may take many forms: an HTTP request, a read file command, or a form presented to a human, to name a few. Every system has a way to receive these requests and reply to them, (even humans). The asking system then waits for a response. Assuming it did not timeout, the responding system bundles up the response and sends it back to the requestor in some pre-defined format. Once the data is received from the source, it is temporarily stored in memory, to pass along to the rest of the process, where the data will be transformed in some way to produce the output.

Types of Data Sources in UiPath

User Input

There’s just no way around it, sometimes we need humans to answer questions. The most straightforward way to get an answer from a human is to present them with some kind of a form, and wait for them to hit submit. Humans can handle very simple forms, but are unfortunately very error-prone, which must be anticipated and handled. Input screens must be presented in the simplest way possible.

When fetchingActivityReturned AsTips
Single InputInput DialogString
Multiple InputsShow Form, Get Form Values
[Form.Activities]
Dictionary (Of String, String)use global variable
to store form values
System Info

Another source of data is the computer itself. Computers are good at knowing the current date, time, timestamps, starting a timer etc. You can fetch system information such as the current user, file information, and the current environment variables. This is normally done through code, which you can assign to a variable. Here are a few VB examples.

When FetchingCode or ActivityReturned As
Current DateDate.TodaySystem.Date
TimestampDateTime.NowSystem.DateTime
Current UserEnvironment.UserNameString
Domain NameEnvironment.UserDomainNameString
File InfoGet File Info ActivitySystem.IO.FileInfo (Object)
Current DirectoryEnvironment.CurrentDirectoryString
Folder InfoGet Folder Info ActivitySystem.IO.DirectoryInfo (Object)
Spreadsheets

There are many advantages to using spreadsheets to store records of data. Basically a stand-alone table, a spreadsheet file is easy to read, understand, update, and share. Ubiquitous and user-friendly, most users today already have the skills to work with spreadsheets. Spreadsheets are ideal for smaller tables, and for projects where the user needs to see the data. Spreadsheets do have their drawbacks, however. There are data limits with spreadsheets, and they are not a good choice for big data (over a million rows). There are several methods in UiPath to fetch data from spreadsheets.

SourceActivityReturned asTips
ExcelExcel Process Scope > Use Excel File
> Read Range
[Excel.Activities]
DataTableExcel must be installed;
File must not be open
ExcelRead Range Workbook
[Excel.Activities]
DataTableExcel installation not needed
Excel (Office 365)Read Range
[MicrosoftOffice365.Activites]
DataTableslower than using local files
Excel (ODBC)Run Query
[Database.Activites]
DataTableProvider: “System.Data.Odbc”
Google SheetsRead Range
[GSuite.Activities]
DataTable

When fetching data from a spreadsheet, UiPath Activities return the data tucked inside a DataTable Object variable, which has its own syntax. Learn more. Care must be taken to validate the data, to ensure that data is not missing and is in the correct format and type. Sometimes the data must be manipulated to get it into the correct type. The data type of each column is automatically set when data is returned (usually a string or general), but you can loop through the DataTable variable and convert the data using a conversion function. Use the Invoke Code Activity and pass in the DataTable variable as an argument. Here’s an example:

'to open data table and loop through
dtMyTable.AsEnumerable().ToList().ForEach(
   Sub(rs)
      rs("Date")=CDate(rs("Date"))
   End Sub
)

Instead of the above code, however, I highly recommend a package called BalaReva.EasyDataTable.Activities. It contains several Activities to make manipulating data in a DataTable variable much easier. It gives the developer the ability to view the table after it’s imported, and convert the data type of a column if necessary. Learn how to use it here.

Emails

Email has been around a long time, and there are now a myriad of protocols, applications, and methods used to deal with it. It can all get a bit confusing. For now, let’s focus on the most common email applications, and the latest, easiest ways to fetch a bunch of them.

SourceActivityReturned AsTips
GMailGet IMAP Mail Messages
[Mail.Activities]
List<MailMessage>server: “imap.gmail.com”
pwd: set up app password for UiPath
port: 993
IMAP must be enabled in GMail settings
GmailGet Email List
[GSuite.Activities]
List<GMailMessage>
OutlookGet Outlook Mail Messages
[Mail.Activities]
List<MailMessage>
Outlook 365Use above now

Once you get a list, you can loop through it and fetch the details of each email, such as the Subject, Body, Sender, etc. You can also see if there are attachments and download them as well.

Word Processing

UiPath can extract text from various word-processing documents, such as Microsoft Word, Google Docs, and PDFs. Usually only the text is imported, not the formatting. When using OCR to extract text from images you must select an OCR Engine.

SourceActivityReturned AsTips
MS WordWord Application Scope >
Read Text
[Word.Activities]
StringMust have
Word installed
Google DocsGet Text
[GSuite.Activities]
StringTo extract a section,
it must be defined in
Doc as a section break
PDFs (browser)Get Text
[UIAutomation.Activities]
StringFor UI Automation project
as background automation
if PDF is digital
PDFs (native)Read PDF Text
[PDF.Activities]
StringIgnores images
PDFs (scanned)Read PDF Using OCR >
[Select OCR Engine]
[PDF.Activities]
StringBoth text & images
Slower
Websites

Extracting data from websites, also called data-scraping, web-scraping, or screen-scraping, is kind of UiPath’s thing. You can extract text, attributes (including image urls), and tablular data from any website using UiPath’s UIAutomation.Activities package. Start with the Use Application/Browser Activity to set the URL. Save it to a UiElement variable (like uiOutput) for use later.

TypeActivityReturned AsTips
TextGet TextString
AttributeGet AttributeStringto see list of attributes use UiExplorer, Inspector,
or loop through uiOutput.Attributes
VB Code: uiOutput.Get(“attributeName”)
Table DataExtract Table DataDataTableuse Table Extraction wizard in Design
ImageGet AttributeString (of URL)find img tag in Inpector to see if url is
in “src” or “srcset” attribute
Databases

Databases are specifically designed systems used to store, organize, and provide access to data. As the Internet has evolved to offer more and more information, databases have also evolved to store and retrieve massive amounts of data using different structures and methods. Databases range from Access or Foxpro sitting on your computer, to BigData tables that power Google. There are relational databases and unstructured, or NoSQL, databases. Most databases are secured, and read access must be granted by the database administrator in order to retrieve data. Each type of database uses a specific driver. A driver, or provider, is a translation software that sets a communication protocol between the requestor and the database. Sometimes it’s necessary to download the driver. Then, a connection string is created that holds details such as the driver to be used, the name of the database, username and password.

In UiPath, database Activities can be found in the UiPath.Database.Activites package. Some experimentation may be required to connect to your specific database. UiPath offers a connection string wizard to help with building and testing the connection string, and includes many standard drivers. Once connection is established, the Run Query Activity is used to tell the system exactly what data you want, using standard SQL such as “SELECT CustomerId, CustomerName from Customers”. All data is returned as a DataTable variable.

API Calls

API stands for Application Programming Interface, and is a way to get just the data out of a website, without the presentation layer you see on your browser. Think of it as a back door to the data. API programmers create several ways in, called endpoints, which receive calls to their server directly, and return just the data. Programmers use APIs all the time to refresh their content, to update a map, to save something to the database, etc. Some APIs are private, and allow access only if you have a special authorization key of some kind. Others are open, and allow anyone to make the call.

There are two approaches used for APIs: SOAP or REST. The main difference is the SOAP (Simple Object Access Protocol) returns data in XML format, while REST (Representational State Transfer) can return the data in JSON, HTML, XML, or plain text. Of these JSON is the most common. Websites almost always have API instructions published on their site, which gives other programmers the details of their endpoints, formats, and data returned. It is important to review these instructions to set up your call correctly. An API call is basically just a URL string. For example, if you copy and paste this into your browser, you’ll see data returned.

https://reqres.in/API/users?page=2

To make an API call in UiPath, you need the package UiPath.WebAPI.Activities. The HTTP Request Activity has a configure button that launches a form where you can enter the endpoint, the authorization key if needed, and different parameters, and test the returned data. Returned output is split into the status code (an integer – 200 means success) and the returned content (a string in whatever format).

Once received, the content string must be parsed into something useful. This is called deserialization. You can then drill down into that object to pick out the data you need. For example, the above API call returns this JSON-formatted string, lets save it a variable strOutput.

{
"page": 2,
"per_page": 6,
"total": 12,
"total_pages": 2,
"data": [
{
"id": 7,
"email": "michael.lawson@reqres.in",
"first_name": "Michael",
"last_name": "Lawson",
"avatar": "https://reqres.in/img/faces/7-image.jpg"
},
{
"id": 8,
"email": "lindsay.ferguson@reqres.in",
"first_name": "Lindsay",
"last_name": "Ferguson",
"avatar": "https://reqres.in/img/faces/8-image.jpg"
},
{
"id": 9,
"email": "tobias.funke@reqres.in",
"first_name": "Tobias",
"last_name": "Funke",
"avatar": "https://reqres.in/img/faces/9-image.jpg"
},
{
"id": 10,
"email": "byron.fields@reqres.in",
"first_name": "Byron",
"last_name": "Fields",
"avatar": "https://reqres.in/img/faces/10-image.jpg"
},
{
"id": 11,
"email": "george.edwards@reqres.in",
"first_name": "George",
"last_name": "Edwards",
"avatar": "https://reqres.in/img/faces/11-image.jpg"
},
{
"id": 12,
"email": "rachel.howell@reqres.in",
"first_name": "Rachel",
"last_name": "Howell",
"avatar": "https://reqres.in/img/faces/12-image.jpg"
}
],
"support": {
"url": "https://reqres.in/#support-heading",
"text": "To keep ReqRes free, contributions towards server costs are appreciated!"
}

The UiPath Activity Deserialize JSON accepts as input a JSON-formatted string, and converts it into a JSON object. So we enter our previous variable, strOutput, and save the converted output to a new variable. Lets call it objJSON.

Now, we can actually get our data:

Single value

objJSON("total")

Single value embedded

objJSON("support")("url")

Assign Records into a DataTable Variable

In Javascript, arrays are enclosed in square brackets, and objects are enclosed in curly brackets. We can see from the result above there is an array called “data”, which has a list of objects. So let’s take a shortcut and save just save these records into a DataTable variable using an assign activity. The method we will use is found in the newtonsoft.json namespace, so make sure this is imported into your project.

newtonsoft.json.JsonConvert.DeserializeObject(Of Datatable)(objJSON("data").tostring)
Integration Service

We can see from the API Calls section above that a bit of programming is required to get the data into a format that we can use. It is necessary to read and understand how each API has been exposed, and how the call must be crafted. Plus, the API can change, which might break any program that calls it.

To try to make it easier, UiPath has created its Integration Service. Here, the most common 3rd-party APIs are pre-programmed into objects called connectors. As of this writing, there are 144 connectors in UiPath’s Integration Service and more being added all the time. There are connectors for sites such as Microsoft One Drive, Microsoft 365, Salesforce, GMail, Google Drive, AWS, PayPal, Slack and many others. It is up to the programmer of the connector to make sure it is kept up-to-date with the latest API. This should solve any breaking issues, but keep in mind that when using the Integration Service, you are outsourcing its maintenance.

Integration Service handles several things for you: the connection credentials, the API call, and the return format of the data. Another cool feature is triggers. Triggers open up a two-way communication between the site and your automations. With triggers, the 3rd-party site itself notifies UiPath that some event has happened, say an email has been received, which you can subscribe to and then launch an automation. The trigger passes the notification to your process in the form of arguments. In this example it would send the objectID of the incoming email, not the email itself. In your process, you could then fetch the email by its objectID.

To use Integration Service to fetch data in Studio, the activity package corresponding to the IS connector must be installed into the project. Most of the latest UiPath official packages are written to correspond with the latest IS connector. Details can be found in each individual connector documentation. Keep in mind that connections are folder-specific. So you could theoretically have different credentials for different folders. So make sure that your Studio instance is pointing to the right folder when you use Integration Service activities.

Storage Buckets, Assets & Queues

In Orchestrator, UiPath has folder-specific storage areas reserved for holding data for use in processes. Unless limited, users and robots with access to the folder will be able to access this data. There are 3 types of data storage available: Assets, Storage Buckets, and Queues.

Assets are key-value pairs stored at the folder level. Each asset has a name (a key) and a value that can be text, boolean, or integer. Another special type of asset is the Credential Asset, which stores both the UserID and Password. All assets are encrypted. Assets are set up in Orchestrator with their values, although can be set from within the process as well.

Storage Buckets are basically just an online, shared folder used to organize files that many processes can use as input. Any files, such as datasets for Machine Learning, Excel files, PDFs, even testing data files can be stored here. Storage Buckets are set up on a per-folder basis, and usually have the same permissions as the folder. Files are uploaded to them from within Orchestrator. Txt files can be read into a process, but most files must be downloaded for further processing depending on its type.

Queues are a special type of storage used for processing records of data. Queues are set up in Orchestrator with a specific schema. Once a queue is set up, a separate process called a Dispatcher is in charge of loading up the queue with the records. Then, another process called a Performer actually fetches these records from the queue to process them. One advantage of using Queues is built-in validation and error handling. Each record is tracked whether it has been processed successfully or not. Thus Queues are useful for large data sets, where something could go wrong, and can failures can be handled and reviewed. Queues can also be accessed by more than one machine, which can distribute the load of processing the items in the queue. NOTE: Queues are not to be used like databases, with updateable fields. They are intended as read-only, process-once then delete transactional processing.

For this post, let’s assume that the resources already exist in Orchestrator and let’s focus on the fetching activities used for each type. All these activities exist in the Core package.

TypeActivityReturned AsTips
AssetGet AssetString, Boolean,
or Int
as set in Orch
Easy place to store URLs and path strings
Asset CredentialGet CredentialUID: String
PWD: Secure String
Storage BucketRead Storage TextStringfor txt files
Storage BucketDownload Storage FileILocalResourceIn project create folder to hold downloads
Put “foldername/filename.ext” in
File name and location property
Will overwrite
Save result into variable varOut for later use
varOut.LocalPath points to file location
QueueGet Queue ItemsIEnumerable<QueueItem>Filter by status (usually New)
Returns first 100 records
Does not change status automatically
Use For Each to loop through result
To get values: QItem.SpecificContent(“fieldname”)
QueueGet Transaction ItemQueueItemAutomatically sets status to “In Progress”
To get specific record, queue must have unique Reference field
To get values: QItem.SpecificContent(“fieldname”)
After processing Set Transaction Status to “Successful” or “Failed” and put in the reason

Document Understanding (AI)

UiPath.IntelligentOCR.Activities package

App Integration > Document Understanding