Analysis of Airbnb Data — Buenos Aires, Argentina


It is no secret that traveling is one of the most enjoyable and sought-after activities for many people, and good lodging combined with new experiences makes it even better. With that in mind, Airbnb has a lot going for it and can be considered one of the largest hospitality companies today, connecting hosts with people who love to travel. The catch is that it does not own a single hotel!

Airbnb was founded as a startup in 2008, and by the end of 2018, ten years after its creation, it had hosted more than 300 million guests. The image below shows what Airbnb’s site looked like in its early startup days.

Representation of the initial Airbnb layout.

For this article, we analyzed data for the city of Buenos Aires, Argentina, made available through the Inside Airbnb portal. The data used in the analysis can be downloaded from the following Link. Note that this analysis uses the summarized version of the data; a more detailed version is also available in the repository.

Buenos Aires, Argentina

The city of Buenos Aires was founded in 1536 and draws many Brazilians, in part because it receives daily flights from São Paulo and Rio de Janeiro. Many Brazilians also choose it for their first international trip because of the range of things it offers, from its countless sights to its tango-filled nights. Since I would like to visit, I took the liberty of putting together a list of places to see in the city: (1) Temaiken Biopark: a zoo that brings together South America, Africa and Asia in a unique way. (2) Tango shows: Buenos Aires is also known as the tango capital, and I expect a show to be a spectacular attraction. (3) City tour: a fun, alternative way to see a good part of the city quickly and interactively.

An itinerary like this needs a few days in the city, and an Airbnb stay is a natural fit. So I ran some analyses to help choose the best place to stay in Buenos Aires.

Exploratory Analysis

The dataset was cleaned and its outliers were treated (outliers are values that differ sharply from the rest and get in the way of the analysis), which led to some conclusions that we explain in this article. The complete code of the analysis is available at the Link. The results are divided as follows: (1) relationship between type and price of property, (2) relationship between neighborhood and type of property, (3) mapping of prices and (4) final conclusion.
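The article does not say which rule was used to treat values that differ sharply from the rest, so the following is only a minimal sketch of one common option, the interquartile range (IQR) rule. The function name, the `price` column and the toy values are assumptions for illustration, not the author's code.

```python
import pandas as pd


def drop_price_outliers(df: pd.DataFrame, column: str = "price", k: float = 1.5) -> pd.DataFrame:
    """Keep rows whose value lies within [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = df[column].quantile(0.25)
    q3 = df[column].quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return df[df[column].between(lower, upper)]


# Invented toy prices standing in for the real listings file.
listings = pd.DataFrame({"price": [40, 45, 50, 55, 60, 5000]})
clean = drop_price_outliers(listings)
print(len(listings), "->", len(clean))
```

With `k = 1.5` this is the same cutoff a boxplot uses for its whiskers; a larger `k` would keep more of the expensive listings.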

Relationship between type and price of property

In this section we will (1) count the listings of each property type, (2) check what percentage each type accounts for, and (3) evaluate how much each property type influences price. For (1) and (2), we obtained the results in the table below:

From this, we can conclude that most of the available properties are entire homes. These are the most requested, but are they the most affordable?

To check (3) we use a boxplot, which shows, for each property type, the median price and how widely the prices are spread (the box spans the interquartile range, the middle 50% of the values).

This gives some partial conclusions: (1) hotel prices vary the most, (2) entire homes have the second-highest prices, although they are the most chosen thanks to greater convenience and privacy, and (3) private rooms offer the best value for money, since they have the lowest prices and are the second most requested type, with 18.25%.
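A table of counts and percentages per property type could be produced along these lines. This is a sketch that assumes the listings have a `room_type` column; the function name and the toy rows are invented for illustration.

```python
import pandas as pd


def room_type_summary(df: pd.DataFrame, column: str = "room_type") -> pd.DataFrame:
    """Count listings per room type and express each count as a percentage."""
    counts = df[column].value_counts()
    percent = (counts / counts.sum() * 100).round(2)
    return pd.DataFrame({"listings": counts, "percent": percent})


# Invented toy rows; the real values come from the listings file.
listings = pd.DataFrame(
    {"room_type": ["Entire home/apt"] * 3 + ["Private room"] * 2 + ["Hotel room"]}
)
summary = room_type_summary(listings)
print(summary)
```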
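The quantities a boxplot draws can also be computed directly, which makes the comparison between types easier to read as numbers. The sketch below assumes `room_type` and `price` columns; the function name and toy prices are invented for illustration.

```python
import pandas as pd


def boxplot_stats(df: pd.DataFrame, by: str = "room_type", value: str = "price") -> pd.DataFrame:
    """Per-group quartiles and interquartile range (the box of a boxplot)."""
    stats = df.groupby(by)[value].quantile([0.25, 0.5, 0.75]).unstack()
    stats.columns = ["q1", "median", "q3"]
    stats["iqr"] = stats["q3"] - stats["q1"]
    return stats


# Invented toy prices, only to show the shape of the result.
listings = pd.DataFrame(
    {
        "room_type": ["Private room"] * 3 + ["Hotel room"] * 3,
        "price": [20, 30, 40, 30, 60, 200],
    }
)
stats = boxplot_stats(listings)
print(stats)
```

A type with a larger `iqr` has more varied prices, and a lower `median` points to a cheaper typical stay.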

Relationship between neighborhood and type of property

For this analysis we will (1) tabulate the neighborhoods and their average prices, (2) evaluate the price distribution within each neighborhood, and (3) evaluate how the most chosen neighborhoods influence price. For (1), the table below lists the first 10 neighborhoods ordered by average price:

The two neighborhoods with the highest prices also have a very low share of bookings. To better evaluate (2) and (3), we generate the boxplot below:

From this, we can draw some partial conclusions about how the neighborhood influences the booking price. (1) Palermo stands out as the best choice: it is the most chosen, its wide price variance makes it financially versatile, and it is located downtown. (2) If Palermo does not meet your needs, I suggest looking at Recoleta, which is also widely chosen, has an average price similar to Palermo's but with a smaller variance, and is located close to downtown. (3) If your priority is having a range of prices to choose from, the best neighborhood is Floresta, which shows the highest price variance.
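A ranking of this kind could be built with a group-by, as in the sketch below. It assumes `neighbourhood` and `price` columns; the function name is hypothetical and the prices are invented, so the output says nothing about the real neighborhoods.

```python
import pandas as pd


def top_neighbourhoods(df: pd.DataFrame, n: int = 10) -> pd.DataFrame:
    """Average price, listing count and share of listings per neighbourhood."""
    table = df.groupby("neighbourhood")["price"].agg(mean_price="mean", listings="count")
    table["percent"] = (table["listings"] / table["listings"].sum() * 100).round(2)
    return table.sort_values("mean_price", ascending=False).head(n)


# Invented toy prices attached to neighbourhood names from the article.
listings = pd.DataFrame(
    {
        "neighbourhood": ["Palermo"] * 3 + ["Recoleta"] * 2 + ["Floresta"],
        "price": [50, 70, 90, 60, 70, 200],
    }
)
ranking = top_neighbourhoods(listings)
print(ranking)
```

Keeping the share of listings next to the average price makes it easy to spot neighborhoods that are expensive but rarely chosen.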

Mapping of prices

To map prices, I plotted the listings by latitude and longitude and built a color palette that groups prices into bands, coloring each point according to the price at that location. For comparison, I placed a map of the Buenos Aires region alongside it. Finally, I obtained the following result.
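The grouping of prices into color bands could look like the sketch below. It assumes `latitude`, `longitude` and `price` columns, and it assumes quantile-based bands via `pd.qcut`, since the article does not say how the groups were defined. Both function names and the toy coordinates are invented for illustration.

```python
import pandas as pd


def add_price_band(df: pd.DataFrame, n_bands: int = 4) -> pd.DataFrame:
    """Label each listing with a quantile-based price band (0 = cheapest)."""
    out = df.copy()
    out["price_band"] = pd.qcut(out["price"], q=n_bands, labels=False, duplicates="drop")
    return out


def plot_price_map(df: pd.DataFrame, path: str = "price_map.png") -> None:
    """Scatter listings by longitude/latitude, colored by price band."""
    import matplotlib

    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(6, 6))
    points = ax.scatter(df["longitude"], df["latitude"], c=df["price_band"], cmap="viridis", s=8)
    fig.colorbar(points, ax=ax, label="price band")
    ax.set_xlabel("longitude")
    ax.set_ylabel("latitude")
    fig.savefig(path, dpi=150)
    plt.close(fig)


# Invented toy points near Buenos Aires, only to exercise the banding step.
listings = pd.DataFrame(
    {
        "latitude": [-34.58, -34.59, -34.60, -34.61, -34.62, -34.63, -34.64, -34.65],
        "longitude": [-58.42, -58.41, -58.40, -58.39, -58.45, -58.46, -58.47, -58.48],
        "price": [10, 20, 30, 40, 50, 60, 70, 80],
    }
)
banded = add_price_band(listings)
print(banded[["price", "price_band"]])
```

Calling `plot_price_map(banded)` would then write the colored scatter to an image file. Quantile bands keep a few very expensive listings from washing out the colors of everything else.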

Final conclusion

Therefore, we can conclude that the best neighborhood to stay in is Palermo, followed by Recoleta, and if you want a wider range of prices to fit your needs, there is Floresta. Remember, the full code of the analysis is available at the Link.


Sebastiao Ferreira de Paula Neto

Data engineer with a passion for data science, I write efficient code and optimize pipelines for successful analytics projects.