Introduction
This page is my notebook for collecting thoughts about investment. I will try to select investment real estate in the small town of Avsallar using Python libraries and a modern data processing approach.
I live in the town of Avsallar. It is located on the southern coast of Turkey, about 120 km from Antalya. In 2021, apartments of roughly 50 square meters here were selling for ~40-50 thousand euro. In late 2022, after the war in Ukraine began, Avsallar property prices almost doubled compared with 2021. The town now has more than 50 new buildings and setes under construction.
It's a small town near Alanya with several 4 and 5 star hotels on the beach and hundreds of so-called setes (pronounced like the word 'see'). A sete is a fenced area with 24/7 guards, a hamam, a pool, a sauna and other infrastructure on the territory. Tourists like to rent an apartment in these setes for a couple of months or for the whole touristic season from May to October.
The season almost doubles rental prices here. As of April 2023, you can rent a small one-person apartment for 400 euro in winter and 600-800 euro in season.
The Antalya region has a warm subtropical climate: +10C/+18C in winter from December to March and +25C/+35C during the rest of the year.
Dataset
Getting real-world data for my investment calculator project turned out to be quite difficult. I need to parse some Telegram chats for ads and walk into every real estate company in town, pretending I'm thinking of buying something. For now I chose to get some data from the real market. The most problematic part is sold apartments, because nobody publishes this information. A local trading peculiarity is that ads do not carry the same price as the sold objects: selling prices are usually lower, because sellers include a trade discount in the ad.
But I still have to get some real data to process. Here is what I can use for that:
Dataset sources:
- Local real estate sites with basic sale information such as object size, building type, location and so on.
- Telegram channels and chats for foreigners looking for an apartment to buy.
- Real estate companies and agencies with new projects and newly built houses and setes.
Writing Web Scraping Program: RH agency site
I rented an apartment through the local agency Rixos Homes, and they have a web site with ads for apartments for sale in new buildings.
To start getting data from this site, I need to build a URL that lists all objects in Avsallar:
https://www.rixoshome.com.tr/en/search-results/?keyword&location%5B0%5D=alanya&areas%5B0%5D=avsallar&status%5B0%5D=for-sale&type%5B0%5D=apartment&rooms&bedrooms&bathrooms&property_id
Creating a Python virtual environment:
mkdir ./rh && cd ./rh
python3 -m venv venv
. ./venv/bin/activate
pip3 install requests bs4
I see there is no option to get all items on one page. This forces me to write a small Python program that saves all pages into a list. It stops when the web site returns an HTML page containing the phrase 0 Result Found.
import requests

pages = []
page = 0
while True:
    print('Loading page', page)
    url = 'https://www.rixoshome.com.tr/en/search-results/page/'
    url += str(page)
    url += '/?keyword&location%5B0%5D=alanya&areas%5B0%5D=avsallar&status%5B0%5D=for-sale&type%5B0%5D=apartment&rooms&bedrooms&bathrooms&property_id'
    response = requests.get(url)
    if '0 Result Found' in response.text:
        break
    else:
        pages.append(response.text)
        page += 1
print('All pages loaded.')
print('All pages loaded.')
In the next part of the program I'll extract the items containing the apartments' parameters from the HTML pages.
from bs4 import BeautifulSoup

print('Extracting data...')
apartments = []
for page in pages:
    soup = BeautifulSoup(page, 'html.parser')
    items = [item.text for item in soup.findAll(class_=['item-body'])]
    for item in items:
        apartments.append(item)
print(apartments)
Let's see what the result items look like. Here is one item from the apartments list:
'\n\n\n\t\t\t\t\tFor Sale\n\t\t\t\t\n\t\t\t\t\tNew Costruction\n\t\t\t\t\n\n\nNew Apartments in Avsallar, Alanya\n \nStart from 135.000€ Avsallar, Alanya, Antalya, Akdeniz Bölgesi, Türkiye \nBeds: 2Bath: 178 m²Apartment \n\tDetails\n\n\nadmin \n\n\n\t11 months ago\n'
There is not much useful information in this data piece.
- I can remove For Sale, because all apartments in this site query are for sale.
- I can remove the address, because it says nothing certain about where exactly the house is located and it is the same for each item.
- The Beds: 2Bath: 1 fragment has no separator between the fields, nor between them and the area in square meters, which is the next parameter. Apartment is a filter parameter and can be removed too; it is the same for every item in this list.
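For illustration, the run-together fields could still be split with a regex, under the assumption that the bathroom count is a single digit (a hypothetical recovery, not used further in this post):

```python
import re

# One list item as scraped, with no separators between the fields
item = 'Beds: 2Bath: 178 m²Apartment'

# Assume the bathroom count is one digit; the remaining digits before 'm²' are the area
m = re.match(r'Beds:\s*(\d+)Bath:\s*(\d)(\d+)\s*m²', item)
beds, baths, size = m.groups()
print(beds, baths, size)  # 2 1 78
```

This is fragile, which is exactly why the detail pages below are a better source.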
But it still has one valuable field: New Apartments in Avsallar, Alanya, which is the name argument for the Details page link. That page has a lot of information and explanation, including where the apartment is located, plus a list of columns with all parameters that are easy to extract. I just need to get the Details link for each item on each page, then download the HTML from the web site and do the same work with the bs4 library.
Rewriting the second part of the program to get all details href elements:
from bs4 import BeautifulSoup

print('Extracting data...')
urls = set()
for page in pages:
    soup = BeautifulSoup(page, 'html.parser')
    for item in soup.findAll(class_=['btn btn-primary btn-item'], href=True):
        print(item.text, item['href'])
        urls.add(item['href'])
for url in urls:
    print('Loading', url)
    response = requests.get(url)
    soup = BeautifulSoup(response.text, 'html.parser')
    for item in soup.findAll(class_=['property-detail-wrap property-section-wrap']):
        print(item.text)
The details pages are much better designed and I can get structured data with proper separation:
Loading https://www.rixoshome.com.tr/en/property/new-apartments-in-avsallar-alanya-24/
Details
Updated on April 7, 2023 at 12:50 pm
Property ID:
RH-Emerald Grand Deluxe
Price:
155.000€
Property Size:
47 m²
Bedroom:
1
Rooms:
2
Bathroom:
1
Year Built:
2023
Property Type:
Apartment
Property Status:
For Sale
Balconys: 1
Now this data needs to be converted into Python dictionaries and saved for further processing. I'll rewrite this part of the program again:
from bs4 import BeautifulSoup

print('Extracting data...')
urls = set()
for page in pages:
    soup = BeautifulSoup(page, 'html.parser')
    for item in soup.findAll(class_=['btn btn-primary btn-item'], href=True):
        print(item['href'])
        urls.add(item['href'])
parameters = []
for url in urls:
    print()
    print('Loading', url)
    response = requests.get(url)
    soup = BeautifulSoup(response.text, 'html.parser')
    for section in soup.findAll(class_=['property-detail-wrap property-section-wrap']):
        ul = section.find('ul')
        for li in ul.findAll('li'):
            param = li.text.replace('\n','')
            param = param.split(':', 1)
            parameters.append(param)
            print(param)
Now I have something like this:
['Property ID', 'RH-Euro Avsallar']
['Price', ' 160.000€']
['Property Size', '44 m²']
['Bedroom', '1']
['Rooms', '2']
['Bathroom', '1']
['Year Built', '2023']
['Property Type', 'Apartment']
['Property Status', 'For Sale']
['Balconys', ' 1']
Much better now! It seems this can be stored in an SQLite database. But it's not that simple, because the data on the web site was filled in by humans and there are formatting mistakes. For example, if you scroll down the output you'll see:
['Distance to the sea', ' 3500 m']
['Distance to Local Centre ', ': 5000 m']
If I want to process this data, I should remove this mess from the second field of the parameters whose names start with the word Distance. Fortunately, I don't need the measurement letter m here, so I can just remove all non-digits from those fields.
Let's improve this for loop again. First, I need param.replace('²','') to remove the square meters symbol from the string, because isdigit() classifies '²' as a digit even though int() cannot convert it.
param = param.replace('²','')
And this part removes all non-digit characters and converts what is left into an int(), if the parameter name is not in a certain exclusion list.
if param[0] not in ['Property ID', 'Property Type', 'Property Status']:
    param[1] = int(''.join(symbol for symbol in param[1] if symbol.isdigit()))
Here is the final version of the for loop from the second part of the Python scraping program:
for li in ul.findAll('li'):
    param = li.text.replace('\n','')
    param = param.replace('²','')
    param = param.split(':', 1)
    if param[0] not in ['Property ID', 'Property Type', 'Property Status']:
        param[1] = int(''.join(symbol for symbol in param[1] if symbol.isdigit()))
    parameters.append(param)
Now the data is in proper format:
['Property ID', 'RH-BTok - 22']
['Price', 201500]
['Property Size', 180]
['Bedrooms', 4]
['Rooms', 5]
['Bathrooms', 2]
['Property Type', 'Apartment']
['Property Status', 'For Sale']
['Distance to the sea', 900]
The web-scraping-dataset.py program is in the Github repository.
Saving The Data
The extracted data must be saved to disk to be used more than once in the future. I can save it as a JSON file or put it into an SQLite database.
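For reference, the SQLite alternative would look roughly like this (a minimal sketch with a hypothetical table layout; the JSON route is what this post actually uses):

```python
import sqlite3

# A couple of scraped records, shaped like the parsing loop's output
apartments = {
    0: {'Price': 102000, 'Property Size': 57},
    1: {'Price': 155000, 'Property Size': 100},
}

con = sqlite3.connect(':memory:')  # use a file path for a persistent database
con.execute('CREATE TABLE apartments (id INTEGER PRIMARY KEY, price INTEGER, size INTEGER)')
for key, p in apartments.items():
    con.execute('INSERT INTO apartments VALUES (?, ?, ?)',
                (key, p['Price'], p['Property Size']))
con.commit()
print(con.execute('SELECT price FROM apartments WHERE id = 0').fetchone()[0])  # 102000
```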
Let's change the bs4 part of the program to create a structured dictionary instead of a list, because each apartment has a different number of parameters described on the web site. It will be much easier to process dictionary data later.
The point is to make an apartments = {} dictionary keyed by name, with the parameters as values. That's why I changed parameters from a list to a dict and moved it inside the for loop.
apartments = {}
name = 0
for url in urls:
    print()
    print('Loading', url)
    response = requests.get(url)
    soup = BeautifulSoup(response.text, 'html.parser')
    for section in soup.findAll(class_=['property-detail-wrap property-section-wrap']):
        ul = section.find('ul')
        parameters = {}
        for li in ul.findAll('li'):
            param = li.text.replace('\n','')
            param = param.replace('²','')
            param = param.split(':', 1)
            if param[0] not in ['Property ID', 'Property Type', 'Property Status']:
                param[1] = int(''.join(symbol for symbol in param[1] if symbol.isdigit()))
            parameters[param[0]] = param[1]
        apartments[name] = parameters
        print(name, parameters)
        name += 1
print(apartments)
The integer name variable counts all loaded apartments. This is how the result rows look:
Loading https://www.rixoshome.com.tr/en/property/new-apartments-in-avsallar-alanya-11/
45 {'Property ID': 'RH-Onurlife', 'Price': 88000, 'Property Size': 50, 'Bedroom': 1, 'Rooms': 2, 'Bathroom': 1, 'Property Type': 'Apartment', 'Property Status': 'For Sale', 'Balconys': 1}
OK, now I can save the apartments dictionary to a JSON file.
import json
from datetime import date

dt = date.today().strftime("%Y-%m-%d")
filename = 'for-sale-' + dt + '.json'
with open(filename, 'w') as f:
    json.dump(apartments, f, indent=4, sort_keys=True)
Here is an example of the output JSON file:
{
"0": {
"Bathroom": 1,
"Bedroom": 1,
"Price": 102000,
"Property ID": "RH-PINE GARDEN",
"Property Size": 57,
"Property Status": "For Sale",
"Property Type": "Apartment",
"Rooms": 2,
"Year Built": 2023
},
...
Now I can visualize this data.
Scraped Data Visualization
I wrote another program to visualize the previously saved JSON dictionary.
Install the pandas and matplotlib libraries in the virtual environment first:
pip3 install pandas matplotlib
Let's load the JSON file into a pandas dataframe:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_json('for-sale-2023-04-09.json', orient='index')
print(df)
P = df.plot(kind='scatter', x='Property Size', y='Price', color='red')
P.set_title('Prices scatter')
plt.show()
The orient='index' argument is important to get normal columns in the dataframe, something like this:
Bathroom Bedroom Price ... Distance to Local Centre Floor Level Room
0 1.0 1.0 102000 ... NaN NaN NaN
2 1.0 1.0 120000 ... NaN NaN NaN
3 1.0 1.0 119000 ... NaN NaN NaN
If I don't use orient='index', the dataframe will be difficult to process. It will look like this:
0 1 ... 45 46
Bathroom 1 NaN ... 1 1
Bedroom 1 NaN ... 1 1
Price 102000 155000 ... 88000 105800
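The difference can be reproduced on a tiny inline example (synthetic records, same key layout as the saved file):

```python
import pandas as pd
from io import StringIO

# Two records keyed by string ids, like the saved for-sale JSON
raw = '{"0": {"Price": 102000, "Rooms": 2}, "1": {"Price": 155000, "Rooms": 3}}'

# orient='index': keys become the row index, fields become columns
df = pd.read_json(StringIO(raw), orient='index')
print(df.shape)            # (2, 2)
print(sorted(df.columns))  # ['Price', 'Rooms']
```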
Let's see the price distribution by apartment size in square meters.
{width=100%}
Preliminary Analysis of Scraped Data
- Most apartments RH has for sale are between 40 and 60 square meters, the so-called 1+1 apartments.
- Prices for 1+1 apartments are between 75 and 125 thousand euro.
These 1+1 apartments are the most popular to rent during the touristic season. Other apartments are for living, and their price does not depend on seasonal hotel room prices. That makes 1+1 apartments the most interesting for investment.
As an exception, I see 5 more apartments in the 40-60 square meters segment priced at 140-160 thousand euro. I assume those are close to the sea or have a sea view. Let's find out what exactly these exceptional apartments are. The program was improved to highlight the Property ID and Distance to the Sea parameters.
...
df = pd.read_json('for-sale-2023-04-09.json', orient='index')
df.loc[(df['Price'] >= 125000) & (df['Property Size'] <= 65), 'Expensive'] = 'True'
df = df.loc[df['Expensive'] == 'True']
P = df.plot(kind='scatter', x='Property Size', y='Price', color='red')
P.set_title('Prices scatter')
P.set_xlim(left=40, right=65)
P.set_ylim(top=170000)
for index, row in df.iterrows():
    print(index, row['Property ID'], row['Property Size'], row['Price'], row['Rooms'], row['Distance to the sea'], row['Year Built'])
    hint = row['Property ID'] + '\n'
    hint += str(row['Distance to the sea'])
    plt.annotate(hint, (float(row['Property Size']), float(row['Price']) + 1500), color='#ffffff', ha='center', va="center")
...
The improved version has two important changes. First, the overall dataframe must be filtered by Price and Property Size values with df.loc[]:
# Create a new column and mark all 'Expensive' apartments.
df.loc[(df['Price'] >= 125000) & (df['Property Size'] <= 65), 'Expensive'] = 'True'
# Create a new dataframe with expensive apartments only.
df = df.loc[df['Expensive'] == 'True']
Second, the for iterator loop defines additional labels and their coordinates for annotation.
{width=100%}
13 RH-Emerald Grand Deluxe 47 155000 2.0 nan 2023.0
19 RH-Euro Avsallar 44 160000 2.0 nan 2023.0
21 RH-Perli Mare 50 159000 2.0 700.0 nan
26 RH-McEmerland 60 129000 2.0 750.0 2014.0
28 RH-AvsallarGreenTowers 55 145000 2.0 nan 2023.0
42 RH-Nirvana Forest 2 53 161000 2.0 300.0 nan
I don't have distance data for every apartment on this web site, but I see a pattern here.
- The Distance to the Sea parameter has a stronger effect on price than Year Built.
- All of the most expensive 1+1 apartments are located 750 meters or closer to the sea.
- The most distant apartment is the cheapest.
Next I will add the distance-to-the-sea measurement for the overall dataframe. If you were attentive, you could see strange values in the Distance to the Sea field of the JSON dataset, such as 10 or 14. It is impossible in Avsallar to build properties so close to the sea, because there are already many hotels on the beach. That means values between 0 and 10 are in kilometers and must be multiplied by a thousand.
Values between 10 and 30 don't look real either; they could have appeared when non-digit symbols were removed from the initially parsed dataset. The RH real estate company has no properties for sale farther than 3 km from the sea, and closer than 100 meters to the sea I found only villas and hotels.
The improved version of pandas-viz-scraping.py shows the whole dataframe annotated with the corrected sea distance parameter and the room count, when known. I will zoom into the densest and most interesting sector: 1+1 apartments between 40 and 70 square meters priced up to 120 thousand euro.
for index, row in df.iterrows():
    print(index, row['Property ID'], row['Property Size'], row['Price'], row['Rooms'], row['Distance to the sea'], row['Year Built'])
    dist = row['Distance to the sea']
    if not pd.isna(dist):
        dist = int(dist)
        if dist <= 10: dist *= 1000
        if dist > 10 and dist < 30: dist *= 100
    hint = str(dist)
    rooms = row['Rooms']
    if not pd.isna(rooms):
        hint = str(dist) + '/' + str(int(rooms))
    plt.annotate(hint, (float(row['Property Size']), float(row['Price']) + 1500), color='#ffffff', ha='center', va="center")
{width=100%}
Apartment price prediction is non-linear, because each district and each sete has a different infrastructure and distance from the sea. That means the data must be clustered. The price prediction, as well as investment functions like IRR and NPV, will be calculated with Python's numpy-financial library functions.
To calculate NPV I need to choose the Discount Rate first. For an individual investor who does not work with WACC, the discount rate is the rate of return the investor could earn in the market on an investment of comparable size and risk. It is the opportunity the investor would be giving up by investing in the property in question.
For me personally, the alternative investment is a bank deposit with a 1% effective interest rate.
Let's look at the investment case:
- I purchase a 1+1 apartment for 80 thousand euro, 2 km from the sea.
- I rent the apartment out for 500 euro a month, at least 10 months per year, for the next 3 years.
- I then try to sell the apartment for 80 thousand euro.
Will it be profitable?
import numpy_financial as npf
dr = 0.01
values = [-80000, 5000, 5000, 5000, 80000]
npv = npf.npv(dr, values)
print('NPV:', npv)
NPV: 11583.353594803091
The answer is YES: a positive NPV means the project or investment may be profitable and worth pursuing.
A negative NPV means the project or investment is unlikely to be profitable and should probably not be pursued. An NPV of zero means the project or investment is neither profitable nor costly. A company may still consider projects with an NPV of zero if they have significant intangible benefits, such as strategic positioning, brand equity, or increased consumer satisfaction.
IRR is the rate of growth that an investment is expected to generate annually.
import numpy_financial as npf
irr = npf.irr(values)
print('IRR:', irr)
IRR: 0.04795510285010396
6% or more is usually considered a good IRR value; here I have about 4.8%, and it's OK. A low IRR with a high NPV indicates that although the rate of return may be slower than in other projects, the investment itself is likely to yield significant value.
The 5000 euro per year was taken just as an average example. I met a guy who wanted to invest in 2021-2022 and talked with real estate agencies. They offered him an average income, after paying all taxes and the management share, of approx. 4500-5500 euro. I should calculate this part more accurately later.
Risks
The Avsallar real estate market is overheated now because of the war in Ukraine. Many Russians and Ukrainians moved to Turkey to live or to wait for the end of the war. Over the next 3 years the situation may change and the market may cool down. That's why I set the sell price equal to the purchase price. The lowest exit price is ~68 thousand euro: at that sale price the investment only preserves the money, without any profit.
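The ~68 thousand euro break-even exit price can be checked with the same cash-flow assumptions (1% discount rate, three years of 5000 euro rent, sale in year four):

```python
# Hedged check of the break-even sale price, plain Python instead of numpy-financial
dr = 0.01
rent = 5000

# Present value of three years of rent income
pv_rent = sum(rent / (1 + dr) ** t for t in (1, 2, 3))

# Sale price P in year 4 that makes NPV = 0: P = (80000 - pv_rent) * (1 + dr)**4
break_even = (80000 - pv_rent) * (1 + dr) ** 4
print(round(break_even))  # 67946

# Cross-check: the NPV with this sale price is zero by construction
npv_check = -80000 + pv_rent + break_even / (1 + dr) ** 4
print(abs(npv_check) < 1e-9)  # True
```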
Telegram public chat messages
The number of active foreigners in Russian-speaking Telegram communities grew in 2022, and many real estate agents use these public chats to sell apartments. In this part I will try to get some information from one of those chats and visualize it the same way as in the web scraping before.
Getting for sale messages
I use the Telethon library to connect to the Telegram API server and get public channel messages. The Telegram API allows downloading up to 200 000 of the latest messages from public chats. To use this program, you should get your ID and access token (hash) from the Telegram Apps web site.
In Russian, прода is the root of 'sale' words: продам means 'I sell' and продается means 'for sale'. To filter all 'for sale'-like messages, I use this common root прода in combination with квартира, which means an apartment.
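The keyword filter itself is simple; a sketch on invented example messages (the real query runs inside the Telethon download program):

```python
# Toy messages illustrating the Russian keyword filter (invented examples)
messages = [
    'Продаётся новая квартира в резиденции',  # 'a new apartment is for sale...'
    'Сдам квартиру на лето',                  # 'renting out an apartment...' - no match
    'Продам квартиру 1+1 в Авсалларе',        # 'I am selling a 1+1 apartment...'
]
# Keep messages containing both the sale root and the apartment stem
for_sale = [m for m in messages
            if 'прода' in m.lower() and 'квартир' in m.lower()]
print(len(for_sale))  # 2
```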
The tg-ublic-chat-dl.py program is in the Github repository.
As a result I have a 98.5 KiB (100,828 bytes) structured JSON file with the latest 'for sale' messages from the Avsallar Telegram public chat. It seemed it would be a much more plentiful data source than the real estate company web site, but not by much. There must be more apartments for sale, but the agents made those messages insufficiently informative.
{
"180878": [
"2023-03-18 06:15:53+00:00",
"🔥 Авсаллар 🔥 \n\nПродаётся новая квартира в резиденции UYAN GÖKSU \n\n🌅 1+1, 53 м2\n🌅 С видом на море\n🌅 Центр Авсаллара\n🌅 В шаговой доступности от сетевых магазинов\n🌅 В комплексе есть: открытый бассейн, тренажерный зал\n🌅 Все документы на руках\n🌅 Подходит для инвестиций\n🌅 Цена ниже рыночной \n🌅 Дешевле, чем у застройщика на 26.000€\n🌅 Расстояние до моря 400 м\n🌅 Дата окончания строительства: 01.05.2023\n\n💶 88.500 €\n#Аланья #Алания #новостройка #продажа"
],
...
As always, there are some problems to solve:
- Duplicate messages are often posted by human agents.
- Different price notations like 88.500 €, 88.500€ or 88.500 евро (Russian for 'euro') must be converted to integer values.
These problems force me to write a post-processing program. Otherwise I cannot visualize the data effectively.
Remove duplicate messages
Real estate agents in the Telegram chat keep posting the same messages until the apartment is sold. Those duplicates must be removed from the JSON file with the tg-dataset-dedup.py program.
Here are the real 'for sale' message stats:
Total messages: 133
Duplicate messages removed: 92
Not much. But it correlates with the 'for sale' apartment data from the RH real estate agency web site.
Converting messages to dictionaries
Previously I got some data from the RH real estate agency web site and spent some time making it structured and easy to visualize.
"1": {
"Bathrooms": 2,
"Bedrooms": 2,
"Price": 155000,
"Property ID": "RH-Roma - 006",
"Property Size": 100,
"Property Status": "For Sale",
"Property Type": "Apartment",
"Rooms": 3
},
The recommended structure is a dictionary of dictionaries, so I need to convert the Telegram chat messages from a dictionary of lists into this structured form.
"54210": [
"2022-08-01 08:20:57+00:00",
"Срочная продажа! Люкс комплекс с полной инфраструктурой!Инвест.проект.В Авсалларе 1+1,в строящемся доме. По цене, существенно ниже, чем у застройщика. Цена 74000 €🔥.Один дом на 56 квартир. У застройщика цена от 94500€. Дом уже строится, этаж где расположена квартира построен. Сдача дома. 04.2023."
],
There are only 1+1 and 2+1 apartments for sale in the Telegram chat. I am not interested in processing messages without a room count, nor apartments without a square meters parameter.
Problems to solve:
- The apartment area in square meters м2 can be mentioned as 50 м2 or as 50м2.
- Some messages give the apartment area as 55/60, meaning a netto/brutto specification.
- The фактически)\nбрутто:90,нетто-85.\n2 notation problem.
- The price is mentioned as 125.000 € or as 125000€.
- Sometimes the apartment specification has a room count and square meters size, and sometimes it doesn't.
To solve these problems, tg-fields-parser.py was written.
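A sketch of the price normalization part (a hypothetical helper, not the actual tg-fields-parser.py code), handling the €/евро and thousands-dot variants:

```python
import re

def parse_price(text):
    # Match prices like '125.000 €', '125000€' or '88.500 евро' and return an int
    m = re.search(r'(\d{1,3}(?:[.\s]\d{3})+|\d+)\s*(?:€|евро)', text)
    if m is None:
        return None
    return int(re.sub(r'\D', '', m.group(1)))

print(parse_price('Цена 74000 €'))   # 74000
print(parse_price('💶 88.500 €'))     # 88500
print(parse_price('125.000 евро'))   # 125000
```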
The resulting JSON file has the same structure as the web scraping JSON dataset.
{
"101615": {
"Date": "2022-11-08 13:40:16+00:00",
"Price": 130000,
"Property Size": 52,
"Rooms": 2
},
...
Telegram data visualization
Using the Python pandas library, I can plot the Telegram public chat 'for sale' messages with the pandas-viz-telegram.py program.
{width=100%}
This scatter shows the apartments that have Property Size column data in the Telegram dataset. There could be more points on the scatter, but size is a very important parameter for decision making.
...
173593 2023-03-05 13:29:32+00:00 88500 53.0 2.0
181427 2023-03-19 09:27:20+00:00 105000 72.0 NaN
183979 2023-03-23 12:02:39+00:00 155000 90.0 3.0
23407 2022-04-20 10:17:54+00:00 85000 NaN 2.0
23580 2022-04-21 08:24:56+00:00 85000 NaN 2.0
47966 2022-07-12 13:35:22+00:00 95000 NaN 2.0
61661 2022-08-24 07:43:09+00:00 67000 NaN 2.0
61865 2022-08-24 15:24:08+00:00 67000 NaN 2.0
66977 2022-09-09 09:30:52+00:00 122000 47.0 2.0
...
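Dropping the rows without a size before plotting is one line in pandas; a tiny synthetic example:

```python
import pandas as pd

# Synthetic rows: one message has no Property Size parsed
df = pd.DataFrame({
    'Price': [88500, 85000, 155000],
    'Property Size': [53.0, None, 90.0],
})
plot_df = df.dropna(subset=['Property Size'])
print(len(plot_df))  # 2
```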
I see an interesting niche at approx. 70 m2 with the same price the RH agency asks for 40-60 m2 apartments. This means those points may be secondary housing, or 'fix & flip' investment apartments in old buildings.
Sahibinden Web Scraping
Sahibinden is a popular web marketplace where people trade cars, apartments, computers, anything.
This marketplace uses JavaScript for dynamic content generation and CloudFlare bot protection against DoS attacks. That makes me change my requests library approach and use the more powerful selenium library to get apartment data from this site. I also have to use the undetected-chromedriver library to keep CloudFlare from detecting selenium usage, pretending to be human.
pip3 install selenium undetected-chromedriver
The web-scraping-dataset-sb.py program uses the Chrome browser to emulate human actions, accept all cookies and list 10 pages of 'for sale' apartments.
The resulting lists were saved in a 142.2 KiB (145,590 bytes) JSON file. This dictionary must be converted into a pandas-compatible format, but the advantage here is that I have 478 apartments for sale!
{
"0": [
"/listing/emlak-konut-satilik-antalya-alanya-da-83-m2-2-plus1-satilik-daire-kurumsal-firmadan-1008734570/detail",
"Antalya Alanya'da 83 m2 2+1 Satılık Daire Kurumsal Firmadan",
"83",
"2+1",
"1,925,000 TL",
"16 April2023",
"Avsallar"
],
...
This dictionary has no duplicate messages; that issue was solved automatically by the Sahibinden moderation engine.
Converting Sahibinden Data
As you can see, the data is in the Turkish language and already well structured.
There are two conversion operations I need to do:
- Transform the lists into dictionaries for pandas visualization compatibility.
- Convert the prices from Turkish lira to euro.
As always, some people use non-standard notation such as Studio Flat or 4.5+1; these can be corrected manually:
rooms = data[key][3]
if rooms == 'Studio Flat (1+0)':
    rooms = 1
elif rooms == '4.5+1':
    rooms = 5
else:
    rooms = sum(int(i) for i in rooms.split('+'))
internalDict['Rooms'] = rooms
I do not use eval() because the input may be unpredictable.
The price conversion code:
rate = 0.047
price = int(''.join(symbol for symbol in data[key][4] if symbol.isdigit())) * rate
internalDict['Price'] = int(price)
Here is the sb-converter.py program.
The resulting JSON dataset is now compatible and can be merged with the previous pandas visualization.
{
"0": {
"Date": "16 April2023",
"Description": "Antalya Alanya'da 83 m2 2+1 Satılık Daire Kurumsal Firmadan",
"Link": "/listing/emlak-konut-satilik-antalya-alanya-da-83-m2-2-plus1-satilik-daire-kurumsal-firmadan-1008734570/detail",
"Location": "Avsallar",
"Price": 90475,
"Property Size": "83",
"Rooms": 3
},
...
}
Sahibinden Data Visualization
The Sahibinden data was added to the visualization using the Python pandas library in the pandas-viz-sb.py program.
{width=100%}
Date Description Link ... Price Property Size Rooms
0 2023-04-16 Antalya Alanya'da 83 m2 2+1 Satılık Daire Kuru... /listing/emlak-konut-satilik-antalya-alanya-da... ... 90475 83 3
1 2023-04-16 ALANYA AVSALLAR SERA LİFE 2 DEN SATILIK DAİRELER /listing/emlak-konut-satilik-alanya-avsallar-s... ... 85070 75 3
10 2023-04-15 AVSALLAR MERKEZDE SATILIK SIFIR DAİRE /listing/emlak-konut-satilik-avsallar-merkezde... ... 115150 65 2
100 2023-04-10 ALANYA AVSALLARDA SIFIR VİLLA UYGUN FİYAT TAKA... /listing/emlak-konut-satilik-alanya-avsallarda... ... 646250 200 5
101 2023-04-10 ALP ASİSTANLIK'TAN ANTALYA AVSALLARDA SATILIK ... /listing/emlak-konut-satilik-alp-asistanlik-ta... ... 82250 65 2
Sahibinden data has a large spread and must be clustered by distance from the sea. I have an idea how to make a certain apartment point more interesting:
- Expand the dictionary with additional parameters.
- Filter the apartments by distance to the sea and other parameters.
- Connect a certain apartment to the rent prices in the same district.
Improving Dataset With Coordinates
The apartment details page does not expose the sea distance directly as a table field. But it has Google Maps coordinates in the id='gmap' div attribute. I'll take them and calculate the distance to the sea from these coordinates.
Here is the improved web-scraping-dataset-sb-coords.py program, which works much slower, but it can open the Google Map tab on the details page and extract the coordinates.
{
"0": [
"/listing/emlak-konut-satilik-alanya-avsallarda-satilik-3-plus1-site-ici-havuzlu-lux-dubleks-daire-1093756292/detail",
"36.63322994314836",
"31.773725651762714",
"ALANYA AVSALLARDA SATILIK 3+1 SİTE İÇİ HAVUZLU LÜX DUBLEKS DAİRE",
"180",
"3+1",
"3,490,000 TL",
"18 April2023",
"Avsallar"
],
...
}
Let's take a look at the converted dataset made by the sb-converter-coords.py program.
{
"0": {
"Link": "/listing/emlak-konut-satilik-alanya-avsallarda-satilik-3-plus1-site-ici-havuzlu-lux-dubleks-daire-1093756292/detail",
"Lat": "36.63322994314836",
"Lon": "31.773725651762714",
"Description": "ALANYA AVSALLARDA SATILIK 3+1 SİTE İÇİ HAVUZLU LÜX DUBLEKS DAİRE",
"Property Size": "180",
"Rooms": 4,
"Price": 164030,
"Date": "18 April2023",
"Location": "Avsallar"
},
...
}
The next task is to put all the objects on the map and maybe calculate the distance from the sea. Let's see what OpenStreetMap can offer to solve this problem. Thanks to Bryan Brattlof's notes.
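Given the coordinates, the straight-line distance to the shore can be approximated with the haversine formula; a sketch where the shoreline reference point is my own rough assumption, not data from the site:

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    # Great-circle distance between two GPS points, in meters
    r = 6371000  # mean Earth radius
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Apartment "0" from the dataset vs. an assumed point on the Avsallar beach
d = haversine_m(36.63322994314836, 31.773725651762714, 36.6258, 31.7696)
print(round(d))  # roughly 900 meters
```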
To make updating and sharing their work easier, OpenStreetMap has split its maps into billions of tiny (256 pixel) sections, called tiles, that we can download individually from its tile servers.
OpenStreetMap has quite a few tile servers that style or prioritize different map features, some with a slightly different API for requesting tiles. The default tile server's API for requesting tiles:
URL = "https://tile.openstreetmap.org/{z}/{x}/{y}.png".format
OpenStreetMap Tile Usage Policy
Map Projections
Using sb-converter-mercator.py, I converted the gmap GPS coordinates into Web Mercator coordinates and stored them:
{
"0": {
...
"Lat": "36.63322994314836",
"Lon": "31.773725651762714",
...
"Location": "Avsallar",
"X": 39436,
"Y": 26662
},
...
}
This function converts GPS longitude and latitude into Web Mercator pixel coordinates (flat coordinates).
import math

TILE_SIZE = 256

def point_to_pixels(lon, lat, zoom):
    r = math.pow(2, zoom) * TILE_SIZE
    lat = math.radians(lat)
    x = int((lon + 180.0) / 360.0 * r)
    y = int((1.0 - math.log(math.tan(lat) + (1.0 / math.cos(lat))) / math.pi) / 2.0 * r)
    return x, y
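A quick sanity check of the conversion: dividing the pixel coordinates by the tile size gives the tile numbers for the API's {x} and {y} arguments (the function is repeated here so the sketch runs standalone):

```python
import math

TILE_SIZE = 256

def point_to_pixels(lon, lat, zoom):
    # GPS (lon, lat) -> Web Mercator pixel coordinates at a given zoom
    r = math.pow(2, zoom) * TILE_SIZE
    lat = math.radians(lat)
    x = int((lon + 180.0) / 360.0 * r)
    y = int((1.0 - math.log(math.tan(lat) + (1.0 / math.cos(lat))) / math.pi) / 2.0 * r)
    return x, y

zoom = 16
# Longitude goes first, then latitude, matching the signature
x, y = point_to_pixels(31.773725651762714, 36.63322994314836, zoom)
x_tile, y_tile = x // TILE_SIZE, y // TILE_SIZE
print(x_tile, y_tile)  # tile numbers in the 0..2**zoom-1 range
```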
Downloading Tiles
Use pandas-viz-sb-1tile.py
program to download a tile for certain aparatment with coords (36.63322994314836, 31.773725651762714)
. Don't forget to define User-Agent
header because of OpenStreeMap policies:
from io import BytesIO
from PIL import Image
import matplotlib.pyplot as plt
import requests

zoom = 16
# longitude goes first, matching the point_to_pixels() signature
x, y = point_to_pixels(31.773725651762714, 36.63322994314836, zoom)
# pixel coordinates -> tile numbers
x_tiles, y_tiles = x // TILE_SIZE, y // TILE_SIZE
# format the url
URL = "https://tile.openstreetmap.org/{z}/{x}/{y}.png".format
url = URL(x=x_tiles, y=y_tiles, z=zoom)
headers = {
    'User-Agent': 'Mozilla/5.0 (platform; rv:geckoversion) Gecko/geckotrail Firefox/firefoxversion',
}
# make the request
with requests.get(url, headers=headers) as resp:
    resp.raise_for_status()  # just in case
    img = Image.open(BytesIO(resp.content))
# plot the tile
plt.imshow(img)
plt.show()
{width=50%}
{width=50%}
Plotting The Data
To download all the tiles needed for the plot basemap, I need to calculate the limits of the data.
top, bot = df.Lat.astype(float).max(), df.Lat.astype(float).min()
lef, rgt = df.Lon.astype(float).min(), df.Lon.astype(float).max()
This gives me a bounding box (in GPS coordinates) that encompasses my entire dataset. Set zoom = 15 to reduce server utilization with fewer tiles to download, and use the point_to_pixels function to convert the GPS coordinates into Web Mercator coordinates.
x0, y0 = point_to_pixels(lef, top, zoom)
x1, y1 = point_to_pixels(rgt, bot, zoom)
Then divide by TILE_SIZE = 256 to calculate the minimum and maximum tile numbers I'll need to download for both the {x} and {y} API arguments. Use math.ceil() to round up, ensuring that small fractions of a tile will still be downloaded.
x0_tile, y0_tile = int(x0 / TILE_SIZE), int(y0 / TILE_SIZE)
x1_tile, y1_tile = math.ceil(x1 / TILE_SIZE), math.ceil(y1 / TILE_SIZE)
As a precaution, add an assert
statement to limit the number of tiles to download and avoid accidentally burdening the OpenStreetMap tile servers.
assert (x1_tile - x0_tile) * (y1_tile - y0_tile) < 50, "That's too many tiles!"
Use the built-in itertools product()
function to loop through every tile, downloading and pasting each one into a single large Pillow image.
from itertools import product

URL = "https://tile.openstreetmap.org/{z}/{x}/{y}.png".format
headers = {
    'User-Agent': 'Mozilla/5.0 (platform; rv:geckoversion) Gecko/geckotrail Firefox/firefoxversion',
}

img = Image.new('RGB', (
    (x1_tile - x0_tile) * TILE_SIZE,
    (y1_tile - y0_tile) * TILE_SIZE))

for x_tile, y_tile in product(range(x0_tile, x1_tile), range(y0_tile, y1_tile)):
    with requests.get(URL(x=x_tile, y=y_tile, z=zoom), headers=headers) as resp:
        resp.raise_for_status()
        tile_img = Image.open(BytesIO(resp.content))
    img.paste(
        im=tile_img,
        box=((x_tile - x0_tile) * TILE_SIZE, (y_tile - y0_tile) * TILE_SIZE))
Rounding the pixel coordinates into {x}
and {y}
tiles with math.ceil()
and int()
made the stitched image larger than the data bounding box, so it doesn't yet line up with the scatter points. Crop the image to get the correct pixel coordinates. Multiply the tile coordinates x0_tile, y0_tile
by TILE_SIZE
to find the pixel coordinates of the top-left corner of the current (oversize) basemap.
x, y = x0_tile * TILE_SIZE, y0_tile * TILE_SIZE
The edge subtraction here differs from Bryan Brattlof's original because of the different hemisphere.
img = img.crop((
    int(x0 - x),  # left
    int(y0 - y),  # top
    int(x1 - x),  # right
    int(y1 - y)))  # bottom
Plotting the dataset.
fig, ax = plt.subplots()
ax.scatter(df.Lon.astype(float), df.Lat.astype(float), alpha=0.5, c='red', s=10)
Here add an extra argument to the imshow()
function to properly locate the image in the final visual. The extent
argument moves an image to a particular region in data space.
ax.imshow(img, extent=(lef, rgt, bot, top))
Lock down the x
and y
axes to the limits defined a few sections ago using the set_ylim()
and set_xlim()
functions.
ax.set_ylim(bot, top)
ax.set_xlim(lef, rgt)
Finally, here is the plot with 472 for sale
objects, generated by pandas-viz-sb-coords.py
.
Some of those coordinates may be wrong because people pick them manually on the source web site. But it's OK for getting a general idea of what is happening on the market.
It would be good to improve the visualization program to cache tiles and reduce repeated requests to the OpenStreetMap servers.
Now I can filter
for sale
apartments and see where the prices are located.
151 objects with price < 100 000 EUR
filter = df["Price"] < 100000
df = df[filter]
18 objects with price < 100 000 EUR and 3 rooms
filter = (df["Price"] < 100000) & (df["Rooms"] == 3)
df = df[filter]
14 objects with price < 100 000 EUR and property size > 80 m2
filter = (df["Price"] < 100000) & (df["Property Size"] > 80)
df = df[filter]
The oldest buildings are mostly closer to the sea, but there are some renovated houses on the first line too.
More than 100 houses were built in 2022 and 2023, and they are located in the upper town. You can see lots of objects on parts of the basemap that show no buildings at all; the map simply isn't up to date. Those are the newest constructions.
Why do some apartments closer to the sea have the same price as others in the center? The main D400
road is too noisy to live near: huge trucks roar there from morning to night. The first floors of the buildings closest to the road are often stores. And the buildings are old.
Trunk Highway And Coastline
The idea is to get coastline points for the Avsallar map segment and use geometry to find the point closest to each object's coordinates. Then calculate the distance.
pip3 install overpy
The way
query parameter accepts coordinates in south, west, north, east
order. My primary task is to calculate the distance to the sea, but the coastline will be the next case. Avsallar, like every Turkish coastal town, has the D-400
highway right near the sea. I'll get all highway waypoints with this ["highway"~"^(trunk)"]
filter.
import overpy

api = overpy.Overpass()
result = api.query("""
    way({0},{1},{2},{3}) ["highway"~"^(trunk)"];
    (._;>;);
    out body;
""".format(bot, lef, top, rgt))
The 40 resulting points come back as two ways because the trunk highway has 2 carriageways:
Name: n/a
Highway: trunk
Nodes:
Lat: 36.632176, Lon: 31.755997
Lat: 36.629589, Lon: 31.758771
Lat: 36.628797, Lon: 31.759652
...
Name: n/a
Highway: trunk
Nodes:
Lat: 36.616360, Lon: 31.773364
Lat: 36.616802, Lon: 31.772872
Lat: 36.618358, Lon: 31.771157
...
I asked the #osm
community at irc.oftc.net
about the coastline. Here is the query that will get all coastline coordinates:
result = api.query("""
nwr({0},{1},{2},{3}) [natural=coastline];
(._;>;);
out body;
""".format(bot,lef,top,rgt))
The good advice I got from the IRC community is to store the coastline coordinates locally to avoid repeated server requests for the same data.
import json

with open('coastline.json', 'w') as clFile:
    json.dump(coastline, clFile, indent=4, ensure_ascii=False)
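To avoid re-querying Overpass on every run, the loading side can check the cache first. A minimal sketch, assuming a `fetch` callable that builds the coastline dict from the Overpass result (the function and its name are my own, not part of the original programs):

```python
import json
import os

def load_coastline(path, fetch):
    """Return coastline points from a local cache, fetching and caching on a miss."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    coastline = fetch()  # e.g. run the Overpass query shown above
    with open(path, 'w') as f:
        json.dump(coastline, f, indent=4, ensure_ascii=False)
    return coastline
```

The second run reads the file and never touches the Overpass servers.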
In this particular case the API returns 3 ways
of nodes, because for Avsallar OSM has two additional objects in the database: a small island and the bay. It is possible to join all nodes into one list for the further planar calculation of nearest points. The pandas-viz-sb-coast.py
program puts the property objects, the coastline and the highway points on a single map.
Calculating Distance With Haversine
This uses the haversine
formula to calculate the great-circle distance between two points – that is, the shortest distance over the earth’s surface – giving an as-the-crow-flies distance between the points.
a = sin²(Δφ/2) + cos φ1 * cos φ2 * sin²(Δλ/2)
c = 2 * atan2( √a, √(1−a) )
d = R * c
Where φ
is latitude, λ
is longitude, R
is earth’s radius (mean radius = 6,371km); note that angles need to be in radians to pass to trig functions.
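The formula above translates directly to Python. A plain reference implementation (R in meters, to match the distances used later):

```python
from math import radians, sin, cos, atan2, sqrt

def haversine(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two GPS points."""
    R = 6371000  # mean earth radius, meters
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    c = 2 * atan2(sqrt(a), sqrt(1 - a))
    return R * c
```

A quick sanity check: one degree of longitude at the equator is about 111.2 km.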
I wanted to start with the KDTree
algorithm, but it does not support the haversine distance, which is better suited for latitude/longitude coordinates. So for the haversine distance I use BallTree
, after converting the GPS coordinates to radians.
from math import radians
from sklearn.neighbors import BallTree
import numpy as np

def nearest(p, targets):
    # sklearn's haversine metric expects (lat, lon) in radians,
    # while the points here are stored as (lon, lat) - so swap
    p_rad = [(radians(p[0][1]), radians(p[0][0]))]
    targets_rad = np.array([[radians(x[1]), radians(x[0])] for x in targets])
    tree = BallTree(targets_rad, metric='haversine')
    result = tree.query(p_rad)
    target_point = result[1].tolist()[0][0]
    earth_radius = 6371000  # meters
    distance = result[0][0] * earth_radius
    return target_point, round(distance[0])
This function takes [(float(row['Lon']), float(row['Lat']))]
property GPS coordinates and the coastline or highway points array as arguments. It returns the coastline/highway point closest to that property object and the distance between it and the property coordinates. Here I just need to print
the nearest coastline point for each df property object and draw the shortest-distance lines in a for loop
.
from matplotlib.patches import ConnectionPatch

clArray = []
for key, value in coastline.items():
    clArray.append((value['Lon'], value['Lat']))

for index, row in df.iterrows():
    prop = [(float(row['Lon']), float(row['Lat']))]
    clIdx, clDist = nearest(prop, clArray)  # avoid shadowing the iterrows index
    conn = ConnectionPatch(
        xyA=prop[0], coordsA='data',
        xyB=clArray[clIdx], coordsB='data',
        arrowstyle='-', fc="black", alpha=0.1)
    ax.add_artist(conn)
    print(clArray[clIdx], clDist)
You can dive deeper into the Python code in the pandas-viz-sb-haversine.py
program. Here is the final map with direct distance lines.
To save all the nearest coastline and highway points and the distance measurements, I wrote a separate sb-converter-coastline.py
program. Here is what the updated data structure looks like:
{
    "0": {
        "CoastLat": 36.6209697,
        "CoastLon": 31.765299,
        "Date": "18 April2023",
        "Description": "ALANYA AVSALLARDA SATILIK 3+1 SİTE İÇİ HAVUZLU LÜX DUBLEKS DAİRE",
        "HighwayDistance": 1280,
        "HighwayLat": 36.6222903,
        "HighwayLon": 31.7669473,
        "Lat": "36.63322994314836",
        "Link": "/listing/emlak-konut-satilik-alanya-avsallarda-satilik-3-plus1-site-ici-havuzlu-lux-dubleks-daire-1093756292/detail",
        "Location": "Avsallar",
        "Lon": "31.773725651762714",
        "Price": 164030,
        "Property Size": "180",
        "Rooms": 4,
        "SeaDistance": 1490,
        "X": 39436,
        "Y": 26662
    },
    ...
}
Real estate agents use the direct distance measurement to attract customers. But there is a more accurate way to measure distance - the OSRM routing engine
.
Calculating Distance With OSRM (not ready)
Calculating the distance between a property and the coastline can be done by combining the k-d tree
algorithm on points (to find the coastline point closest to the property object) with the OSRM
routing engine (to calculate the distance between the property object and that way segment).
It is better to use highway D-400 instead of the coastline when calculating distance with a routing engine.
here: Direct distance notes.
here: Routing
...
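As a sketch of what the routing call could look like against the public OSRM demo server (the URL scheme is OSRM's route service; note OSRM takes lon,lat order, and real use would need a self-hosted server - the helper names here are my own):

```python
import json
import urllib.request

OSRM = "https://router.project-osrm.org"

def route_url(lon1, lat1, lon2, lat2, profile="driving"):
    # OSRM route service: /route/v1/{profile}/{lon},{lat};{lon},{lat}
    return "{0}/route/v1/{1}/{2},{3};{4},{5}?overview=false".format(
        OSRM, profile, lon1, lat1, lon2, lat2)

def route_distance(resp):
    """Extract the route distance in meters from an OSRM JSON response."""
    return resp["routes"][0]["distance"]

# url = route_url(31.773725651762714, 36.63322994314836, 31.7669473, 36.6222903)
# with urllib.request.urlopen(url) as r:
#     print(route_distance(json.load(r)))
```

The routed distance should come out longer than the direct haversine distance, which is the point of the comparison.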
Statistics
Using the pandas
library, the median
, the mean
(also known as avg
), and the min
and max
values can be calculated for entire columns.
Square Meters (m²) Prices
Rooms | Median | Min | Max | Mean | Objects |
---|---|---|---|---|---|
1+0 | 1849 | 1762 | 1974 | 1848 | 5 |
1+1 | 1624 | 873 | 2782 | 1694 | 233 |
2+1 | 1299 | 610 | 3958 | 1358 | 166 |
3+1 | 968 | 588 | 2350 | 1100 | 35 |
Property Prices (€)
Rooms | Median | Min | Max | Mean | Objects |
---|---|---|---|---|---|
1+0 | 86.010 | 65.142 | 105.750 | 85.784 | 5 |
1+1 | 95.880 | 61.053 | 188.000 | 99.177 | 233 |
2+1 | 131.600 | 81.075 | 376.000 | 136.823 | 166 |
3+1 | 183.300 | 106.925 | 293.750 | 183.038 | 35 |
Sometimes people are trying to sell 4 or 9 apartments at once. I removed a couple of those items from the 1+1 category.
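The statistics tables above can be produced in one pass with a groupby aggregation. A sketch with a tiny stand-in DataFrame (the real dataset and its column names are assumed; here Rooms holds the 1+1/2+1 labels):

```python
import pandas as pd

df = pd.DataFrame({  # stand-in for the scraped dataset
    'Rooms': ['1+1', '1+1', '2+1', '2+1', '2+1'],
    'm2Price': [1624, 1694, 1299, 610, 3958],
})

stats = (df.groupby('Rooms')['m2Price']
           .agg(Median='median', Min='min', Max='max', Mean='mean', Objects='count'))
print(stats)
```

One line per rooms category, with the same columns as the tables above.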
There are three ways to calculate the square meter price with Python. Note that comparing with np.nan
directly (is np.nan
or == np.nan
) does not reliably detect missing values; use the isna()/notna()
methods instead.
Numpy Library:
df['m2Price'] = np.where(df['Price'].isna(), np.nan, df['Price'] / df['Property Size'])
Using Python's lambda
function:
df['m2Price'] = df.apply(lambda x: np.nan if pd.isna(x['Price']) else x['Price'] / x['Property Size'], axis=1)
Using DataFrame method loc
:
df.loc[df['Price'].notna(), 'm2Price'] = df['Price'] / df['Property Size']
df.loc[df['Price'].isna(), 'm2Price'] = np.nan
I personally prefer the third way.
How about three measurements in one scatter using matplotlib
in Python?
This distribution shows that Avsallar is too small for the distance to significantly change prices. Let's look closer at 1+1 apartments priced between 90 and 95 thousand EUR. I see that the property size is a more significant factor than the distance to the sea.
Square meters vs comfortable infrastructure!
A guarded residential complex with several blocks, outdoor and indoor pools, a hamam, sauna, cafe, playgrounds, parking and other comforts is much more valuable for foreign buyers. At the same time, the market offers apartments in ordinary buildings without any of those features. That is why we see the same price for different property sizes and distances to the sea. For example: a new 1+1 50 m2 apartment in a residential complex under construction costs about 90,000 EUR, and an ordinary 65 m2 apartment in the town center may have the same price.
There is a separate Apartment Complex Name
field on Sahibinden for that information. But it is not always filled in by the author; the complex name may be anywhere in the description or missing entirely. A good technical solution would be a residential complexes database storing their parameters and coordinates. The dataset could then be matched by coordinates to detect each object's residential complex automatically.
First outcome
Using the matplotlib and pandas Python libraries I visualized the data with the distance from the sea. What I have now is an interesting piece of data. Avsallar grows out from the coastline toward the mountain forests. The current distance from the coast is about 2500 meters. Most foreigners want to live in residential complexes, while locals prefer ordinary stand-alone buildings in the center.
The town is now at the peak of development and on the verge of overproduction. More than 100 new buildings have been built. The town has doubled in size in the last year.
The Mediterranean Sea is calm and peaceful compared to the ocean. But the D-400 trunk highway was built right near the sea. That is why the distance from the sea is not as important as property size or complex type in a small Antalya coastal town like Avsallar; it looks like a neutral factor.
In addition, the Turkish Lira is hard to predict and has been depreciating lately, for reasons I don't fully understand.
The whole apartment market is divided into two segments: apartments in old ordinary houses, which are larger and cheaper, and residential complexes with more expensive apartments - smaller, but with convenient infrastructure and a protected area, which is important for foreign families with children.
Finally, the factors impacting the prices in descending order:
- Residential complex or ordinary building.
- Property size.
- Sea view.
- Distance to the sea.
Will the prices grow more?
I think we're at the peak now, because there have already been two waves of Russian emigrants. The density of migrants in Georgia, Armenia and Serbia is critical now, and the European Union is closed for regime supporters. If there is a third wave, it will not be as massive as before. I don't have any statistics about sold apartments in Avsallar. All I have is the seasonal rental prices; to answer this question I need to collect historical rental median data.
But Avsallar has been closed for foreigners since 2022. You cannot get a permit to live here for more than 3 months. That means the only reason to spend money on apartments here is investment.
Most Important Metrics In Real Estate
Return On Investment (ROI)
In real estate, your property's Return On Investment (ROI)
is the total amount of earnings you receive after subtracting all of your expenditures and loan payments. In short, it's your reward on the sum you invested. At its most basic level, it's simply defined as profit or income minus expenses. The calculation is rather simple: (Gain on Investment – Cost of Investment) / Cost of Investment
. Another way to express real estate ROI is: ROI % = (Net return on investment) / (Cost of investment) * 100
Here's an example of a rental property purchased with cash.
You paid 100,000 EUR in cash for the rental property. The closing costs were 1,000 EUR and remodeling costs totaled 9,000 EUR, bringing your total investment to 110,000 EUR for the property.
A year later. You collected 500 EUR in rent every month. You earned 6,000 EUR in rental income for those 12 months. Expenses including the water bill, property taxes, and insurance, totaled 1,200 EUR for the year or 100 EUR per month. Your annual return was 4,800 EUR (6,000 – 1,200).
To calculate the property’s ROI divide the annual return (4,800 EUR) by the amount of the total investment, or 110,000 EUR.
ROI = 4,800 EUR / 110,000 EUR = 0.043 or 4.3%
Here are some applicable expenses you should consider:
- Property taxes
- Insurance
- Property management fees
- Utilities
- Maintenance and improvements
- Closing costs and other fees
- Pest control services
For a new apartment in Avsallar the situation is quite different. You paid 100,000 EUR for a 1+1 (2 rooms) apartment. The closing costs are already covered by the agency and there is no remodeling cost, so your total investment is 100,000 EUR only.
There are two possible ways to rent this apartment out. Assume you bought the apartment in January and have at least 10 months of rent from a permanent resident at 500 EUR per month.
A year later, you have earned 5,000 EUR in rental income for those 10 months. Expenses for the property tax (aydat
) and insurance totaled 1,000 EUR for the year. The water and electricity bills are usually paid by the resident. Your annual return was 4,000 EUR (5,000 - 1,000).
ROI = 4,000 EUR / 100,000 EUR = 0.04 or 4%
The second scenario is seasonal rental to tourists from April till October, which pays much more - about 800 EUR monthly. Let's assume you'll have residents for the 6 hot months, earning 4,800 EUR for the season. You will pay property taxes and insurance for the whole year anyway: -1,000 EUR. But suppose you also found a resident for the 4 winter months and gained 2,000 EUR more.
ROI = 5,800 EUR / 100,000 EUR = 0.058 or 5.8%
But there is a risk that you will not rent the apartment out for all of these months; even in season, most foreigners come for only 1-2 vacation months. It's up to you to decide which way to choose.
The property tax, electric and water bills in Avsallar depend on the residential complex. The property tax depends on the property size.
Anyway, you can sign a contract with a local real estate agency to avoid all the management complexity.
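The two rental scenarios above can be compared with a small helper (the numbers are the ones from the text; the function name is my own):

```python
def roi(rental_income, expenses, investment):
    """Annual ROI as a fraction: net return / cost of investment."""
    return (rental_income - expenses) / investment

# scenario 1: permanent resident, 10 months x 500 EUR, 1,000 EUR aydat + insurance
long_term = roi(10 * 500, 1000, 100_000)
# scenario 2: 6 season months x 800 EUR plus 4 winter months x 500 EUR
seasonal = roi(6 * 800 + 4 * 500, 1000, 100_000)
print(f"{long_term:.1%} vs {seasonal:.1%}")  # 4.0% vs 5.8%
```

The seasonal scenario wins on paper, but only if every month actually gets rented.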
I have some aydat
property tax data collected per residential complex:
Residential Complex | Apartment Type | Tax (monthly) |
---|---|---|
Orion Resort | 1+1 | 30 EUR |
Orion Resort | 2+1 | 40 EUR |
Emerald Park | 2+1 | 60 EUR |
Emerald Dream | 1+1 | 60 EUR |
Elma Homes | 2+1 | 30 EUR |
The 1+1 property statistics:
> Median m² price: 1624
> Min m² price: 873
> Max m² price: 2782
> Mean (avg) m² price: 1694

> Median Property price: 95880
> Min Property price: 61053
> Max Property price: 188000
> Mean Property price: 99177
> Avg Distance to the sea (m): 1135
> Objects: 233
> ROI: 5.21%
> Avg annual NRI: 5000
To accurately calculate the ROI I need to create map segments with a residential complexes database. Each complex must have its
aydat
costs and the water and electricity prices for both summer and winter.
Net Operating Income (NOI)
NOI = Gross Rental Income - Operating Expenses
Net Operating Income (NOI)
is a measure of the income generated by a property after deducting all of the property's operating expenses. Gross Rental Income
refers to the total income generated by the property from rent and other sources, and Operating Expenses
refer to all of the expenses associated with operating and maintaining the property.
Cost of Investment, on the other hand, refers to the total cost of acquiring and maintaining a real estate investment. This includes the purchase price of the property, closing costs, renovation expenses, and any ongoing maintenance or repair costs. The cost of investment is important because it directly affects the amount of money an investor needs to put into the property to generate a return.
Remember the part from the ROI calculation where the new apartment in Avsallar has no renovation cost and the closing cost is already included in the property price? Then you collected 500 EUR in rent every month for one year. You earned 6,000 EUR in rental income for those 12 months. Expenses, including the water bill, property taxes, and insurance, totaled 1,200 EUR for the year, or 100 EUR per month. That is exactly the NOI
per year: 4,800 EUR (6,000 – 1,200).
Capitalization Rate (Cap Rate)
In real estate, the capitalization rate (or cap rate) is a measure used to estimate the potential return on investment for a property. It is calculated by dividing the property's net operating income (NOI) by its current market value.
The formula for cap rate is as follows:
Cap Rate = Net Operating Income / Current Market Value
The net operating income is calculated by subtracting the operating expenses from the gross rental income. The current market value
is the price that the property would sell for on the open market.
Say you have a 100,000 EUR property. You made some improvements and maybe bought new appliances. The market price of the property may grow, and the cap rate
will be something like 4,800 / 110,000 = 0.043
.
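Both formulas chain together. With the numbers from the ROI example (the helper names are my own):

```python
def noi(gross_rental_income, operating_expenses):
    """Net Operating Income."""
    return gross_rental_income - operating_expenses

def cap_rate(net_operating_income, market_value):
    """Capitalization rate: NOI / current market value."""
    return net_operating_income / market_value

income = noi(6_000, 1_200)        # 4,800 EUR per year
rate = cap_rate(income, 110_000)  # ~0.0436, i.e. about 4.4%
```

Unlike ROI, the denominator here is the current market value, so the cap rate drifts as prices move even if the rent stays fixed.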
Cash flow
In real estate, "cash flow" refers to the net amount of cash generated by a property after all income and expenses have been accounted for during a specific period of time. It represents the actual cash inflows and outflows associated with owning and operating the property.
To calculate the cash flow, you start with the gross income generated by the property, which includes rental income and any other sources of income such as parking fees or laundry fees. From this gross income, you subtract all the operating expenses, which can include property taxes, insurance, maintenance costs, utilities, property management fees, and mortgage payments if applicable.
The resulting amount is the net cash flow, which indicates how much cash the property is generating (or consuming) during that period. A positive cash flow means that the property is generating more income than it is costing to operate, while a negative cash flow indicates that the expenses are higher than the income.
Positive cash flow is generally desired by real estate investors as it provides income and can contribute to the property's profitability. It can be used to cover operating expenses, mortgage payments, and potentially provide a return on investment. Negative cash flow, on the other hand, means that the property is not generating enough income to cover its expenses, which may require additional funds from the owner.
Cash flow is an essential metric for real estate investors as it helps assess the financial performance and sustainability of an investment property. It considers the actual cash position rather than just accounting profits, providing a more accurate measure of the property's profitability and ability to generate income.