The original task description of the Discovery Challenge states:

The bank wants to improve its services. For instance, the bank managers have only a vague idea of who is a good client (whom to offer some additional services) and who is a bad client (whom to watch carefully to minimize the bank's losses). Fortunately, the bank stores data about its clients, the accounts (transactions within several months), the loans already granted, and the credit cards issued. The bank managers hope to improve their understanding of customers and seek specific actions to improve services. A mere application of a discovery tool will not be convincing for them.

In keeping with the original task description, our project goal is to mine and analyze this bank data in order to infer from it the type of customer who makes a good candidate for a credit card.

Domain Description

This database is a collection of financial information from a Czech bank. The dataset covers over 5,300 bank clients and approximately 1,000,000 transactions. The bank has also extended close to 700 loans and issued nearly 900 credit cards, all of which are recorded in the data.

Data Description

Entity-Relationship Description

📌Each account has both static characteristics (e.g. date of creation, address of the branch) given in relation "account" and dynamic characteristics (e.g. payments debited or credited, balances) given in relations "permanent order" and "transaction".

📌Relation "client" describes characteristics of persons who can operate the accounts.

📌One client can have several accounts, and several clients can operate a single account; clients and accounts are related to each other in relation "disposition".

📌Relations "loan" and "credit card" describe some services which the bank offers to its clients.

📌More than one credit card can be issued to an account.

📌At most one loan can be granted for an account.

📌Relation "demographic data" gives some publicly available information about the districts (e.g. the unemployment rate); additional information about the clients can be deduced from this.
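
The relationships listed above can be sketched as a small relational schema. The following is a minimal illustration using Python's built-in sqlite3 module, not the dataset's actual schema: every table and column name (for example `district`, `disp_id`, `unemployment_rate`) is an assumption made for this example, and the "permanent order" and "transaction" relations are left out for brevity. The sketch encodes three rules from the description: "disposition" links clients and accounts many-to-many, several cards may belong to one account, and a UNIQUE constraint allows at most one loan per account.

```python
import sqlite3

# Hypothetical table and column names, chosen only for illustration.
SCHEMA = """
CREATE TABLE district (
    district_id       INTEGER PRIMARY KEY,
    unemployment_rate REAL                  -- example of "demographic data"
);
CREATE TABLE account (
    account_id  INTEGER PRIMARY KEY,
    district_id INTEGER REFERENCES district(district_id),
    created     TEXT                        -- static characteristic
);
CREATE TABLE client (
    client_id   INTEGER PRIMARY KEY,
    district_id INTEGER REFERENCES district(district_id)
);
-- Many-to-many link: which client may operate which account.
CREATE TABLE disposition (
    disp_id    INTEGER PRIMARY KEY,
    client_id  INTEGER NOT NULL REFERENCES client(client_id),
    account_id INTEGER NOT NULL REFERENCES account(account_id)
);
-- Several cards per account are allowed (no UNIQUE on account_id).
CREATE TABLE card (
    card_id    INTEGER PRIMARY KEY,
    account_id INTEGER NOT NULL REFERENCES account(account_id)
);
-- At most one loan per account: enforced by UNIQUE on account_id.
CREATE TABLE loan (
    loan_id    INTEGER PRIMARY KEY,
    account_id INTEGER NOT NULL UNIQUE REFERENCES account(account_id),
    amount     REAL
);
"""


def build_schema(conn):
    """Create the illustrative tables on the given connection."""
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)


def demo():
    """Insert toy rows and report whether a second loan was rejected."""
    conn = sqlite3.connect(":memory:")
    build_schema(conn)
    conn.execute("INSERT INTO district VALUES (1, 3.5)")
    conn.execute("INSERT INTO account VALUES (10, 1, '1995-03-24')")
    conn.executemany("INSERT INTO client VALUES (?, ?)", [(100, 1), (101, 1)])
    # Two clients share one account through "disposition".
    conn.executemany(
        "INSERT INTO disposition (client_id, account_id) VALUES (?, ?)",
        [(100, 10), (101, 10)],
    )
    # Two cards on the same account are fine.
    conn.executemany("INSERT INTO card (account_id) VALUES (?)", [(10,), (10,)])
    conn.execute("INSERT INTO loan (account_id, amount) VALUES (10, 5000.0)")
    try:
        # A second loan on the same account violates the UNIQUE constraint.
        conn.execute("INSERT INTO loan (account_id, amount) VALUES (10, 100.0)")
        second_loan_rejected = False
    except sqlite3.IntegrityError:
        second_loan_rejected = True
    return conn, second_loan_rejected
```

With such a layout, district-level information such as the unemployment rate can be attached to a client by joining `client` to `district` on `district_id`, which is how additional client attributes can be deduced from the demographic data.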

[Figure: diagram of the relations described above (domain1.gif)]

Table Descriptions